Abstract

A variance swap is a derivative with a path-dependent payoff which allows investors to 
take positions on the future variability of an asset. In the idealised setting of a continuously 
monitored variance swap written on an asset with continuous paths it is well known that 
the variance swap payoff can be replicated exactly using a portfolio of puts and calls and a 
dynamic position in the asset. This fact forms the basis of the VIX contract. 

But what if we are in the more realistic setting where the contract is based on discrete 
monitoring, and the underlying asset may have jumps? We show that it is possible to derive 
model-independent, no-arbitrage bounds on the price of the variance swap, and correspond- 
ing sub- and super-replicating strategies. Further, we characterise the optimal bounds. The 
form of the hedges depends crucially on the kernel used to define the variance swap. 

1 Introduction 

The purpose of this article is to construct hedging strategies which super-replicate the payoff of 
a variance swap for any price path of the underlying asset, including price paths with jumps. 
The idea is that at initiation time 0, an agent purchases a portfolio of puts and calls which 
she holds until time T. In addition, she follows a simple, dynamic investment strategy in the 
underlying over [0, T]. Then, for every possible path of the underlying, the sum of the payoff 
from the vanilla portfolio plus the gains from trade from the dynamic strategy is (more than) 
sufficient to cover the obligation from the variance swap. Implicit in this set-up is the idea that 
the super-hedge does not rely on any modelling assumptions. Instead, the super-hedge is robust 
even in the presence of jumps. 

The problem of finding the cheapest super-hedging strategy can be seen as the dual of a 
primal problem which is to bound the prices for variance swaps over the class of all models for the 
asset price process which are consistent with the traded prices of puts and calls. If the variance 
swap is sold for the price upper-bound and hedged with the corresponding super-replicating 
strategy then the seller will not lose money under any scenario. 

The model-independent approach should be contrasted with the standard methodology 
which begins with a stochastic model for asset prices, and then infers the price of the vari- 
ation swap by calculating the expected payoff. However, in markets where vanilla instruments 
are liquidly traded, the prices of puts and calls contain information about the market's expec- 
tations of the future behaviour of asset prices. The existence of this information removes the 
need to model the future, and this fact forms the basis of the model-independent approach. 

In addition to super-hedges and upper bounds on the price of the variance swap we also 
give sub-hedges and lower bounds. Moreover, our analysis is not restricted to any particular 
definition of the variance swap, nor is it based on a mathematical idealisation of a continuous
time limit of the swap contract, but rather on a discrete set of observations. We define variance 
swaps through their kernels; bivariate functions with regularity properties making them suitable 
to measure variance properties of the price path. Examples of kernels include squared simple 
returns, squared log returns and squared price differences. Furthermore, the sub- and super- 
replicating hedges work for discretely sampled variance swaps and continue to work in the 
continuous time limit. As long as the price path has a quadratic variation, these limits exist
by Föllmer's path-wise Itô formula [17]. The standard approach to variance swap pricing is to
assume a stochastic model and that the underlying paths are generated from a semi-martingale 
process with respect to this model. In this article, a model is specified only when it is necessary 
to show that the cheapest super-replicating hedge is tight. 

Under some minimal restrictions on the form of the variance swap kernel we find a family of
super-hedging strategies. This family is parameterised by a set of monotone functions. Then,
given that the prices of call options for the expiry date of the variance swap are known (or
equivalently the marginal law of the underlying price process at maturity is known) we show that
there exists a cheapest super-replicating hedge from the given family. This hedge is associated 
with a monotone function, and we use this function to describe a stochastic model for the 
forward price of the asset in which the price process is continuous, except perhaps for a single 
jump, after which the process remains constant. In the continuous time limit, the super-hedge 
replicates the payoff of the variance swap if the asset price follows this one-jump model. This 
shows that the bounds we produce are best possible and justifies the restriction of our search 
to hedging strategies within the given family. 

This article shares the model-independent ethos for the pricing of variance swaps implicit in 
Neuberger [25] and Dupire [16] in the setting of continuous price processes. In those articles, it 
was shown that if we assume that the asset price process is a continuous forward price, then the 
continuously monitored variance swap based on either squared log returns or squared simple 
returns is perfectly replicated by the following strategy: synthesise -2 log contracts using put
and call options and trade continuously in the asset to hold a number of shares equal to twice
the reciprocal of the current asset price at all times. We will refer to this strategy as the classical
continuous hedge. By results due to Breeden and Litzenberger [4], it is possible to approximate
any sufficiently regular payoff with vanilla options. As a special case Demeterfi et al. [14] show
how to approximate the log contract with a finite range of vanilla options. It follows that in
the setting of a continuous forward price, the unique no-arbitrage price for the variance swap
is equal to the price of the contract with payoff equal to -2 log contracts. This result holds
independently of any modelling assumptions beyond path continuity. The hedging strategies 
in this paper are of the same character, consisting of a static position in calls and puts and 
dynamic trading in the underlying. However, the underlying setup is considerably more general, 
and the results more powerful since the hedges continue to super-replicate the variance swap for 
discontinuous price-paths and discrete monitoring over arbitrary time partitions. Nonetheless, 
this increase in generality comes at a cost in that instead of a replicating strategy we get sub- 
and super-replicating strategies and instead of a unique no-arbitrage price we get a no-arbitrage 
interval of prices. 

As is well known, the model-independent analysis of derivative prices is related to the con- 
struction of extremal solutions for the Skorokhod embedding problem. This relationship was 
first developed in Hobson [18], see Hobson [19] for a recent survey, and exploits the idea that 
the classification of martingales with a given terminal law is equivalent to the classification of 
stopping times for Brownian motion, such that the stopped process has that given law. As we 
shall see, the monotone function which is associated with the cheapest super-hedging strategy 
arises in the Perkins solution [27] of the Skorokhod embedding problem [29]. For another
example of model independent pricing and the connection between derivatives and the Skorokhod
embedding problem in the context of variance options, see Cox and Wang [11]. In the setting 
of continuous price paths Cox and Wang [11] give bounds on the prices of call options on re- 
alised variance by exploiting a connection with the Root solution of the Skorokhod embedding 
problem. 

In a recent paper [23], Kahale shows how to derive a tight sub-replicating strategy and
corresponding model-independent lower bound for the price of a variance swap based on the 
squared log return kernel. The paper by Kahale was an inspiration for our study which grew 
from an attempt to relate his work to the previous literature on model-independent bounds and 
the Skorokhod embedding problem. By framing the problem in this way we extend the results 
of Kahale [23] to other kernels, and give upper bounds as well as lower bounds. Moreover, in 
the case of squared returns where the connection is particularly explicit, we explain the origin of 
the extremal models, and we give a natural interpretation for some of the quantities appearing 
in [23] in terms of the Perkins embedding of the Skorokhod embedding problem. The analysis 
of the squared returns kernel motivates our general approach to variance swap bounds and 
links this work to previous results of the authors (Hobson and Klimmek [20] ) on characterising 
solutions of the Skorokhod embedding problem with particular optimality properties. 

Also, we give an interpretation of the continuous time limit of the bounding strategies
using Föllmer's path-wise Itô calculus [17]. Föllmer's non-probabilistic Itô calculus has been
used elsewhere in mathematical finance, most notably by Bick and Willinger [2], and helps 
emphasise the fact that the gains from trade have an interpretation as (the limit of) Riemann 
sums. 

One of the features of our analysis is that we study the variance swap under a variety of 
definitions for the contract. Early definitions of the variance swap were based on squared simple 
daily returns. Accordingly, the first analysis of the discrepancy between the classical continuous 
hedge and realised variance in the presence of jumps, which is due to Demeterfi et al. [13],
focused on this kernel. Later, the finance industry switched to a standardised definition based 
on log-returns. (These contracts are typically sold OTC, and therefore any specification of 
the contract, and any observation frequency is possible.) In their comprehensive survey of the 
literature on variance derivatives Carr and Lee [8] give a plausible reason for this change based 
on the fact that banks tended to be buyers of variance swaps. Conventional wisdom states that 
downward jumps are more frequent than upward jumps and, in contrast to the situation for 
squared simple returns, for the squared log-return the contribution of downward jumps to the
value of the variance swap is positive. Hence a switch to the log return definition was profitable 
to the banks. 

This conjecture about the history of the variance swap illustrates the idea that in the 
presence of discrete monitoring or jumps (but not in the case of continuous monitoring and 
continuous price processes) each kernel lends different characteristics to variance swap values. 
Partly for this reason a variety of kernels have been proposed in the literature. Bondarenko [3] 
introduces a kernel which lies between the squared log return and squared simple return def- 
initions. Bondarenko's proposal is motivated by the fact that variance swaps based on this 
kernel can be replicated perfectly in the presence of jumps and in discrete time. In a recent 
working paper, Neuberger [26] provides an alternative analysis for this type of payoff, introduc- 
ing the so-called aggregation property. Neuberger also shows that kernels with this property 
have a model-independent price. The kernel proposed by Carr and Corso [7] in the context 
of commodity markets, which is based on squared price differences, belongs to the same class. 
Recently Martin [24] has proposed yet another definition which is similar to the squared-return 
kernel but involves both the forward and the asset price. Our analysis covers all these kernels 
(though the kernel in [24] is only covered for the case of zero interest rates), and emphasises
that the impact of jumps depends crucially on the nature of the kernel. We find that kernels
split into two classes (below we name them increasing and decreasing kernels) and the special
properties of the Bondarenko kernel come from the fact that it lies in the intersection of these 
classes. 

Apart from asset price jumps, a further issue in the pricing and hedging of variance swaps 
is that the idealised continuous time limit may be a poor approximation to the traded contract 
which is based on discrete monitoring. For example, in [5] Broadie and Jain show that when the 
price path has negative jumps the value of the discretely monitored (log-return) variance swap 
can differ significantly from the value of the continuously monitored variance swap. Similarly, 
Bondarenko [3] investigates the hedging error that develops if the strategy of the classical 
continuous approach is approximated discretely, and reports replication errors of around 30 
percent. From a theoretical perspective, Jarrow et al. [22] show that the
price of the continuously monitored variance swap may be finite whilst simultaneously the discretely
sampled analogue has an infinite price, an observation which raises fundamental questions
about the validity of using the continuous time integrated variance as an approximation for the 
discretely monitored quantity. These previous studies underscore the importance of a model- 
independent analysis, especially one based on a finite number of monitoring points. Again 
in the continuous set-up, Platen and Chen [28] show that variance swap values are infinite 
under realistic modelling assumptions and argue that this implies a risk of liquidity crises in 
financial markets. This article helps to quantify that risk: if call prices are such that the model 
independent upper bound for the variance swap is finite, then for all models which are consistent 
with the market data the variance swap value is finite. 

Recognising the importance of the jump contribution to variance swap values, Carr, Lee
and Wu [10] show how it is possible to price and hedge a variance swap based on log returns
if the asset price follows a Lévy model. The analysis is extended to a more general class of
variation swaps in [9]. Given a particular Lévy model for the dynamics of the price path, Carr
and Lee show that there exists a model-dependent adjustment to the multiplier 2 appearing
in the classical continuous hedge such that the value of the variance swap is given by the new
multiplier times the price of a log-contract. In general, this price is not enforceable through
a hedging strategy. Moreover, since all models are wrong and since the adjustment of the
multiplier depends on specifying a particular model, this approach may still significantly
misprice realised variance, even if the Lévy model calibrates well to options prices.

The appeal of the classical continuous hedge of Neuberger and Dupire is that, apart from 
price-path continuity, the only necessary assumption is that a log contract can be synthesised 
from put and call options, and then the option payoff can be replicated perfectly along each 
path. In this article, we continue to assume that regular payoffs can be replicated with vanilla 
options, but relax the continuity assumption. The prices of variance swaps are highly sensitive 
to the presence of jumps, and so this is an important advance. 

The remainder of the paper is structured as follows. In the next section we introduce the 
variance swap, and show how the definition depends on the form of the kernel. In Section 3 
we study the problem in the setting of continuous monitoring for a process with jumps. The 
understanding we develop in this section will motivate much of the subsequent analysis. Section 4
contains the main result, and shows how to construct a class of sub-hedging strategies.
In Sections 5 and 6 we find the most expensive sub-hedge of this class for a given set of call 
prices, and thus we derive a model independent bound on the price of a variance swap, and 
then we show this bound is best possible, by showing that in the continuous time limit it can be 
attained. In Section 7 we extend our results from contracts written on forwards to include the 
case of contracts written on undiscounted prices. The penultimate section gives some numerical 
results and concluding remarks are given in Section 9. 

2 Variance Swap Kernels and Model-Independent Hedging



2.1 Variation swaps 

We begin by defining the payoff of a variance swap on a path-wise basis. The payoff will depend
on a kernel, on the times at which the kernel is evaluated and on the asset price at these times.

Definition 2.1. A variation swap kernel is a continuously differentiable bi-variate function
$H : (0, \infty) \times (0, \infty) \to [0, \infty)$ such that for all $x \in (0, \infty)$, $H(x, x) = 0 = H_y(x, x)$. We say that
the swap kernel is regular if it is twice continuously differentiable.

A variance swap kernel is a regular variation swap kernel $H$ such that $H_{yy}(x, x) = 2x^{-2}$.

Our main focus in this article is on variance swap kernels but we will discuss the variation swap
kernels $H^S(x, y) = (y - x)^3$ and $H^Q(x, y) = (y - x)^2$ briefly, see Remark 3.1 and Example 6.10.
(Strictly speaking $H^S$ is not a variation swap kernel since it is not non-negative, but most of
our analysis still applies in this case.) A regular variation swap kernel is a variance swap kernel
if $H(x, x(1 + \delta)) = \delta^2 + o(\delta^2)$ for $\delta$ small. Examples of variance swap kernels include

$$H^R(x, y) = \left(\frac{y - x}{x}\right)^2, \qquad H^L(x, y) = (\log(y) - \log(x))^2 \qquad \text{and} \qquad H^B(x, y) = -2\left(\log(y/x) - \left(\frac{y}{x} - 1\right)\right).$$

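Definition 2.2. A partition $P$ on $[0, T]$ is a set of times $0 = t_0 < t_1 < \cdots < t_N = T$. A partition
is uniform if $t_k = kT/N$, $k = 0, 1, \ldots, N$. A sequence of partitions $\mathcal{P} = (P^{(n)})_{n \ge 1} = (\{t_k^{(n)} ; 0 \le k \le N^{(n)}\})_{n \ge 1}$ is dense if

$$\lim_{n \uparrow \infty} \sup_{k \in \{0, \ldots, N^{(n)} - 1\}} |t_{k+1}^{(n)} - t_k^{(n)}| = 0.$$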

Definition 2.3. A price realisation $f = (f(t))_{0 \le t \le T}$ is a càdlàg function $f : [0, T] \to (0, \infty)$.

Definition 2.4. The payoff of a variation swap with kernel $H$ for a partition $P$ and a price
realisation $f$ is

$$V_H(f, P) = \sum_{k=0}^{N-1} H(f(t_k), f(t_{k+1})). \qquad (2.1)$$
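
As an illustration of Definition 2.4, the following minimal Python sketch evaluates the sum (2.1) on a finite list of observed forward prices for the three variance swap kernels listed above. The function names and the sample prices are hypothetical and are chosen only for this example; they do not appear in the paper.

```python
import math

def kernel_R(x, y):
    """Squared simple return: H^R(x, y) = ((y - x) / x)^2."""
    return ((y - x) / x) ** 2

def kernel_L(x, y):
    """Squared log return: H^L(x, y) = (log y - log x)^2."""
    return (math.log(y) - math.log(x)) ** 2

def kernel_B(x, y):
    """Bondarenko kernel: H^B(x, y) = -2 (log(y/x) - (y/x - 1))."""
    return -2.0 * (math.log(y / x) - (y / x - 1.0))

def variation_swap_payoff(kernel, prices):
    """Sum of kernel(f(t_k), f(t_{k+1})) over consecutive observations, as in (2.1)."""
    return sum(kernel(x, y) for x, y in zip(prices, prices[1:]))

# Hypothetical observations f(t_0), ..., f(t_N) of the forward price.
path = [100.0, 102.0, 90.0, 93.0]
payoffs = {
    "R": variation_swap_payoff(kernel_R, path),
    "L": variation_swap_payoff(kernel_L, path),
    "B": variation_swap_payoff(kernel_B, path),
}
```

For a single downward move the three kernels are ordered, with the Bondarenko kernel lying between the squared simple return and the squared log return, which matches the description of that kernel in the Introduction.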

Remark 2.5. (i) The price realisations $f$ should be interpreted as realisations of the forward
price of the asset with maturity T. Later we will extend the analysis to cover undiscounted
price processes, rather than forward prices.

(ii) Large parts of the subsequent analysis can be extended to allow for price processes which
can take the value zero, provided we also define $H(0, 0) = 0$, or equivalently truncate the
sum in (2.1) at the first time in the partition that $f$ hits 0. In this case we must have that
zero is absorbing, so that if $f(s) = 0$, then $f(t) = 0$ for all $s \le t \le T$.

(iii) In practice the variance swap contract is an exchange of the quantity $V = V_H(f, P)$ for
a fixed amount K. However, since there is no optionality to the contract, and since the 
contract paying K can trivially be priced and hedged, we concentrate solely on the floating 
leg. 

(iv) In many of the earliest academic papers, and in particular in Demeterfi et al. [13, 14],
but also in some very recent papers, e.g. Zhu and Lian [30], the variance swap is defined
in terms of the kernel $H^R$. However, it has become market practice to trade variance
swaps based on the kernel $H^L$. Nonetheless these contracts are traded over-the-counter
and in principle it is possible to agree any reasonable definition for the kernel. Variance
swaps defined using the variance kernel $H^B$ were introduced by Bondarenko [3], see also
Neuberger [26]. As we shall see, the contract based on this kernel has various desirable
features. For continuous paths, in the limit of a dense partition the contract does not
depend on the chosen kernel, see Example 6.10 and Lemma 6.9, but this is not the case
in general.

(v) The labels {S, Q, R, L, B} on the variation swap kernels denote {Skew, Quadratic, Returns,
Logarithmic returns, Bondarenko} respectively.

Let $\mathcal{P} = (P^{(n)})_{n \ge 1}$ be a dense sequence of partitions. If $\lim_{n \uparrow \infty} V_H(f, P^{(n)})$ exists then the limit
is denoted $V_H(f, P_\infty)$ and is called the continuous time limit of $V_H(f, P^{(n)})$ on $\mathcal{P}$.

An important concept will be the quadratic variation of a path. For a dense sequence of
partitions $\mathcal{P}$, the quadratic variation $[f]$ of $f$ on $\mathcal{P}$ is defined to be

$$[f]_t = \lim_{n \uparrow \infty} \sum_{t_i^{(n)} \le t} \left(f(t_{i+1}^{(n)}) - f(t_i^{(n)})\right)^2,$$

provided the limit exists. We split the function into its continuous and discontinuous
parts, $[f]_t = [f]^c_t + \sum_{u \le t} (\Delta f(u))^2$. Later we will relate this definition to that introduced by
Föllmer [17], which is used to develop a path-wise version of Itô calculus.

2.2 Model independent pricing

Our goal is to discuss how to price the variance swap contract, or more generally any path- 
dependent claim, under an assumption that European call and put (vanilla) options with ma- 
turity T are traded and can be used for hedging, but without any assumption that a proposed 
model is a true reflection of the real dynamics. In this sense the strategies and prices we derive 
are model independent and robust. 

Let call prices be given by $C(K)$, expressed in units of cash at time T. We assume that
a continuum of calls is traded, and to preclude arbitrage we assume that $C$ is a decreasing
convex function such that $C(0) = f(0)$, $C(K) \ge (f(0) - K)^+$ and $\lim_{K \uparrow \infty} C(K) = 0$, see e.g. Davis
and Hobson [12]. We exclude the case where $C(f(0)) = 0$ for then $C(K) = (f(0) - K)^+$ and the
situation is degenerate: the forward price must remain constant and upper and lower bounds on
the price of the variance swap are zero. Although we assume that calls are traded today (time
0), we do not make any assumption on how call prices will behave over time, except that they
will respect no-arbitrage conditions and that on expiry they will be worth the intrinsic value.

Definition 2.6. A synthesisable payoff is a function $\psi : (0, \infty) \to \mathbb{R}$ which can be represented
as the difference of two convex functions (so that $\psi''(x)$ exists as a measure).

Let $\Psi$ be the set of synthesisable payoffs $\psi : (0, \infty) \to \mathbb{R}$. Then for $\psi \in \Psi$ we have

$$\psi(f) = \psi(f(0)) + \psi'_+(f(0))(f - f(0)) + \int_{(0, f(0)]} (x - f)^+ \psi''(x)\,dx + \int_{(f(0), \infty)} (f - x)^+ \psi''(x)\,dx, \qquad (2.2)$$

where $\psi'_+$ denotes the right-derivative. Thus we can represent the payoff of any sufficiently
regular European contingent claim as a constant plus the gains from trade from holding a fixed
quantity of forwards, plus the payoff of a static portfolio of vanilla calls and puts.
Let $D[0, t]$ denote the space of càdlàg functions on $[0, t]$.
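
The representation (2.2) can be checked numerically for a smooth payoff. The sketch below, written in Python, takes the log contract $\psi(x) = -2\log(x/f(0))$, for which $\psi''(x) = 2/x^2$, and compares $\psi(f)$ with the right-hand side of (2.2) computed by a simple midpoint rule. The function names, the quadrature rule and the numerical values are assumptions made for this illustration only.

```python
import math

def psi(x, f0):
    """Log contract payoff psi(x) = -2 log(x / f0)."""
    return -2.0 * math.log(x / f0)

def psi_prime(x):
    return -2.0 / x

def psi_second(x):
    return 2.0 / x ** 2

def midpoint_integral(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def synthesised_payoff(f, f0):
    """Right-hand side of (2.2) for the log contract at terminal price f."""
    linear = psi(f0, f0) + psi_prime(f0) * (f - f0)
    if f < f0:
        # Only puts with strikes x in (f, f0] finish in the money.
        options = midpoint_integral(lambda x: (x - f) * psi_second(x), f, f0)
    else:
        # Only calls with strikes x in (f0, f) finish in the money.
        options = midpoint_integral(lambda x: (f - x) * psi_second(x), f0, f)
    return linear + options

f0 = 100.0
errors = [abs(synthesised_payoff(f, f0) - psi(f, f0)) for f in (60.0, 100.0, 150.0)]
```

In this sketch the put integral is active when the terminal price is below $f(0)$ and the call integral when it is above, so only one of the two option portfolios pays out on any given path.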

Definition 2.7. A dynamic strategy for a fixed partition $P$ is a collection of functions $\Delta = (\delta_{t_0}, \ldots, \delta_{t_{N-1}})$,
where $\delta_{t_j} : D[0, t_j] \to \mathbb{R}$. The payoff of a dynamic strategy along a price
realisation $f$ is

$$\sum_{k=0}^{N-1} \delta_{t_k}\big((f(t))_{0 \le t \le t_k}\big)\,(f(t_{k+1}) - f(t_k)). \qquad (2.3)$$

Let $\Delta(P)$ be the set of dynamic strategies.

Definition 2.8. $\Delta \in \Delta(P)$ is a Markov dynamic strategy if $\delta_{t_j}((f(t))_{0 \le t \le t_j}) = \delta_{t_j}(f(t_j))$ for all $j$.
A Markov dynamic strategy is a time homogeneous Markov dynamic strategy (THMD-strategy)
if $\delta_{t_j}(f(t_j)) = \delta(f(t_j))$ for all $j$.

In the sequel we will concentrate mainly on THMD-strategies. The quantity $\delta_{t_j}$ represents
the quantity of forwards to be held over the interval $(t_j, t_{j+1}]$. In principle this quantity may
depend on the current time and on the price history $(f(t))_{0 \le t \le t_j}$. However, as we shall see, for
our purposes it is sufficient to work with a much simpler set of strategies where the quantity
does not explicitly depend on time, nor on the price history except through the current value.
We call this the Markov property, but note there are no probabilities involved here yet.

Definition 2.9. A semi-static hedging strategy $(\psi, \Delta)$ is a function $\psi \in \Psi$ and a dynamic
strategy $\Delta \in \Delta(P)$. The terminal payoff of a semi-static hedging strategy for a price realisation
$f$ is

$$\psi(f(T)) + \sum_{k=0}^{N-1} \delta_{t_k}\big((f(t))_{0 \le t \le t_k}\big)\,(f(t_{k+1}) - f(t_k)). \qquad (2.4)$$

Without loss of generality we may assume that $\psi'(f(0)) = 0$. If not then we simply adjust
each $\delta_{t_k}$ by the quantity $\psi'(f(0))$ and the payoff in (2.4) is unchanged. In the sequel, we
will concentrate on the case when $\Delta$ is a THMD strategy. Then we identify $\Delta \in \Delta(P)$ with
$\delta : (0, \infty) \to \mathbb{R}$ and write $(\psi, \delta)$ instead of $(\psi, \Delta)$.

Given that investments in the forward market may be assumed to be costless, the dynamic
strategy has zero price. Thus, in order to define the price of a semi-static hedging strategy it
is sufficient to focus on the price associated with the payoff function $\psi$. The last two terms in
(2.2) are expressed in terms of the payoffs of calls and puts. Thus we can identify the price
of $\psi(f(T))$ with the price of a corresponding portfolio of vanilla objects. We also use put-call
parity (footnote 1) to express the cost of the penultimate term in (2.2) in terms of call prices. Let
$\Psi_0 = \{\psi \in \Psi : \psi'_+(f(0)) = 0\}$.

Definition 2.10. The price of a semi-static hedging strategy $(\psi \in \Psi_0, \Delta \in \Delta(P))$ is

$$\psi(f(0)) + \int_{(0, f(0)]} \psi''(x)\,(C(x) + x - f(0))\,dx + \int_{(f(0), \infty)} \psi''(x)\,C(x)\,dx.$$
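
To make Definition 2.10 concrete, the Python sketch below prices the payoff $\psi(x) = -2\log(x/f(0)) + 2(x - f(0))/f(0)$, which lies in $\Psi_0$ and has $\psi''(x) = 2/x^2$, from a continuum of call prices. Purely for illustration the call prices are generated by a Black model with constant volatility; this model, the truncation of the strike range, the midpoint rule in log-strike and all function names are assumptions of the sketch and not part of the paper. Puts are priced from calls by put-call parity for the forward (put = call + strike - forward).

```python
import math

def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def black_call(f0, strike, sigma, T):
    """Undiscounted Black call price on a forward (illustrative model only)."""
    s = sigma * math.sqrt(T)
    d1 = (math.log(f0 / strike) + 0.5 * s * s) / s
    return f0 * norm_cdf(d1) - strike * norm_cdf(d1 - s)

def price_of_hedge(psi_at_f0, psi_second, call, f0, width=2.0, n=4000):
    """Price in Definition 2.10, integrating over strikes f0*exp(-width)..f0*exp(width).

    The substitution x = f0 * exp(k) gives dx = x dk; a midpoint rule in k is
    applied separately to the put side (k < 0) and the call side (k > 0).
    """
    h = width / n
    total = psi_at_f0
    for i in range(n):
        k = (i + 0.5) * h
        x_put = f0 * math.exp(-k)
        x_call = f0 * math.exp(k)
        put = call(x_put) + x_put - f0          # put-call parity for the forward
        total += h * x_put * psi_second(x_put) * put
        total += h * x_call * psi_second(x_call) * call(x_call)
    return total

f0, sigma, T = 100.0, 0.2, 1.0
price = price_of_hedge(
    0.0,                                # psi(f(0)) = 0 for this payoff
    lambda x: 2.0 / x ** 2,             # psi''(x)
    lambda K: black_call(f0, K, sigma, T),
    f0,
)
```

Under the Black model assumed here the expected value of $-2\log(X_T/f(0))$ is $\sigma^2 T$, so the computed price should be close to 0.04 for the parameters above.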

The idea we wish to capture is that the agent holds a static position in calls together with a 
dynamic position in the underlying such that in combination they provide sub- and super-hedges 
for the claim. 

Definition 2.11. Let $G = G((f(t_k))_{k = 0, \ldots, N})$ be the payoff of a path-dependent option. Suppose
that there exists a semi-static hedging strategy $(\psi, \Delta)$ such that on the partition $P$

$$G \le (\text{respectively} \ge)\ \psi(f(T)) + \sum_{k=0}^{N-1} \delta_{t_k}\big((f(t))_{0 \le t \le t_k}\big)\,(f(t_{k+1}) - f(t_k)).$$

Then $(\psi, \Delta)$ is called a semi-static super-hedge (respectively semi-static sub-hedge) for $G$.

Given a semi-static sub-hedge (respectively super-hedge) we say that the price of the sub-hedge
(respectively super-hedge) is a model independent lower (respectively upper) bound on the
price of the path-dependent claim $G$.

1 This means that we do not need to introduce a notation for the put price, which is convenient since P is
already in use for the partition. Put-call parity for the forward says that the price of a put with strike $x$ is the
price of a call with the same strike plus $x - f(0)$.

2.3 Consistent models

The aim of the agent is to construct a hedge which works path-wise, and does not depend
on an underlying model. Nonetheless, sometimes it is convenient to introduce a probabilistic
model and a stochastic process, and to interpret $f(t)$ as a realisation of that stochastic process.
In that case we work with a probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ supporting the stochastic process
$X = (X_t)_{0 \le t \le T}$.

Definition 2.12. A model $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ and associated stochastic process $X = (X_t)_{0 \le t \le T}$ is
consistent with the call prices $(C(K))_{K \ge 0}$ if $(X_t)_{t \ge 0}$ is a non-negative $(\mathbb{F}, \mathbb{P})$-martingale and if
$\mathbb{E}[(X_T - K)^+] = C(K)$ for all $K > 0$.

In the setting of a stochastic model $V_H(X, P) : \Omega \to \mathbb{R}_+$ is a random variable, and for
$\omega \in \Omega$, $V_H(X(\omega), P)$ is a realised value of a variance swap. From a pricing perspective we are
interested in getting upper and lower bounds on $\mathbb{E}[V_H(X(\omega), P)]$ as we range over consistent
models. Knowledge of call prices is equivalent to knowledge of the marginal law of $X_T$ under
a consistent model (Breeden and Litzenberger [4]). If we write $\mu$ for the law of $X_T$ and if
$C_\mu(K) = \mathbb{E}[(Z_\mu - K)^+]$ where $Z_\mu$ is a random variable with law $\mu$, then $X$ is consistent for the
call prices $C$ if $C_\mu(K) = C(K)$. We write $m = \int_0^\infty x\,\mu(dx)$ and we assume, using the martingale
property, that $f(0) = m$. Then the problem of characterising consistent models is equivalent to
the problem of characterising all martingales with a given distribution at time T.
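
The equivalence between call prices and the marginal law can be illustrated on a discrete example. The Python sketch below builds call prices $C_\mu(K)$ from a hypothetical three-point law $\mu$ and then recovers the mass of each atom from the kink of the call price function, using a second difference in the strike. The atoms, probabilities, step size and function names are assumptions chosen for the illustration.

```python
def call_price(strike, atoms, probs):
    """C_mu(K) = E[(Z - K)^+] for a discrete law with the given atoms."""
    return sum(p * max(x - strike, 0.0) for x, p in zip(atoms, probs))

def recover_mass(call, strike, h):
    """Mass of mu at `strike`, read off as the jump in the slope of the call curve.

    For a piecewise linear call curve the second difference divided by h is
    exact as long as h is smaller than the gap to the neighbouring atoms.
    """
    return (call(strike + h) - 2.0 * call(strike) + call(strike - h)) / h

# Hypothetical terminal law with mean 100, so that f(0) = m = 100.
atoms = [50.0, 100.0, 150.0]
probs = [0.25, 0.5, 0.25]

def C(K):
    return call_price(K, atoms, probs)

forward = C(0.0)                                   # C(0) = f(0)
recovered = [recover_mass(C, x, 1.0) for x in atoms]
```

Here `forward` reproduces the condition $C(0) = f(0) = m$, and `recovered` returns the probabilities of the three atoms, in line with the statement that the call prices determine the law of $X_T$.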

3 Motivation 

3.1 The continuous case 

In the situation where both the monitoring and the price-realisations are continuous the theory 
for the pricing of variance swaps is complete and elegant. We will use this setting to develop 
intuition for the jump case. 

Suppose that the price realisation $f$ is continuous, and possesses a quadratic variation $[f] : [0, T] \to \mathbb{R}_+$
on a dense sequence of partitions $\mathcal{P}$. Dupire [16] and Neuberger [25] independently
made the observation that the continuity assumption implies that a variance swap with payoff
$\int_0^T f(t)^{-2}\,d[f]_t$ can be replicated perfectly by holding a static portfolio of log contracts and
trading dynamically in the underlying asset. Both Dupire and Neuberger assume $f = X$ is a
realisation of a semi-martingale, but in our setting, the observation follows from a path-wise
application of Itô's formula in the sense of Föllmer [17], see Section 6. Applying Itô's formula
to $-2\log(f(t))$ we have

$$-2\log(f(T)) + 2\log(f(0)) = -2\int_0^T \frac{1}{f(t)}\,df(t) + \int_0^T \frac{1}{f(t)^2}\,d[f]_t. \qquad (3.1)$$

Then, as we show in Section 6 below, down a dense sequence of partitions

$$V_H(f, P_\infty) = \int_0^T \frac{1}{f(t)^2}\,d[f]_t = -2\log(f(T)) + 2\log(f(0)) + \int_0^T \frac{2}{f(t)}\,df(t). \qquad (3.2)$$

Provided it is possible to trade continuously and without transaction costs, the right-hand-side of
this identity has a clear interpretation as the sum of a European contingent claim with maturity
T and payoff $-2\log(f(T)/f(0))$ and the gains from trade from a dynamic investment of $2/f(t)$ in
the underlying. Alternatively, the right-hand-side of (3.2) can be viewed as the payoff of a semi-static
hedging strategy in the continuous time limit for the choice $\psi(x) = -2\log(x/f(0)) + 2(x - f(0))/f(0)$
and $\Delta = (\delta_t)_{0 \le t \le T}$ where $\delta_t((f(u))_{0 \le u \le t}) = (2/f(t)) - (2/f(0))$. Note that there
is equality in (3.2) so that $(\psi, \delta)$ is both a sub- and super-hedge for $V_H(f, P_\infty)$. In particular,
under a price continuity assumption, the variance swap has a model-independent price and an
associated riskless hedge.
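
A discrete-time counterpart of (3.2) can be checked directly. For the Bondarenko kernel of Section 2.1 we have $H^B(x, y) = -2\log(y/x) + 2(y - x)/x$, so the sum in (2.1) telescopes and the log contract plus a holding of $2/f(t_k)$ forwards over each interval replicates the payoff exactly on any partition, with or without jumps. The Python sketch below verifies this on a hypothetical path containing a large downward move; the path and the function names are assumptions of the illustration.

```python
import math

def kernel_B(x, y):
    """Bondarenko kernel H^B(x, y) = -2 (log(y/x) - (y/x - 1))."""
    return -2.0 * (math.log(y / x) - (y / x - 1.0))

def swap_payoff(kernel, prices):
    """Discretely monitored payoff (2.1)."""
    return sum(kernel(x, y) for x, y in zip(prices, prices[1:]))

def classical_hedge_payoff(prices):
    """Payoff -2 log(f(T)/f(0)) plus gains from holding 2 / f(t_k) forwards."""
    static = -2.0 * math.log(prices[-1] / prices[0])
    dynamic = sum((2.0 / x) * (y - x) for x, y in zip(prices, prices[1:]))
    return static + dynamic

# Hypothetical observations, including a jump from 101 down to 70.
path = [100.0, 101.0, 70.0, 72.0, 71.0]
gap = swap_payoff(kernel_B, path) - classical_hedge_payoff(path)
```

The quantity `gap` vanishes up to rounding error. For the kernels $H^L$ and $H^R$ the same discrete hedge leaves a residual, which is the subject of the next subsection.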



3.2 The effect of jumps on hedging with the classical continuous hedge 

Even if the continuity assumption cannot be justified, the associated replication strategy is 
nevertheless a reasonable candidate for a hedging strategy in the general case. Let us focus 
on the discrepancy between the payoff of the variance swap and the gains from trade resulting 
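from using the hedge derived in the continuous case. The path-by-path Itô formula continues
to apply in the case with jumps, see [17] and Section 6 below. Hence

$$-2\log(f(T)) + 2\log(f(0)) = -2\int_0^T \frac{1}{f(t-)}\,df(t) + \int_0^T \frac{1}{f(t-)^2}\,d[f]^c_t - 2\sum_{0 < t \le T}\left(\Delta\log(f(t)) - \frac{\Delta f(t)}{f(t-)}\right).$$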

Note that $d[\log(f)]_t = d[f]^c_t / f(t-)^2 + (\Delta\log(f(t)))^2$. By adding and subtracting the discontinuous
part of the quadratic variation of $\log(f)$ on the right-hand-side of the above expression, we
find

$$-2\log(f(T)) + 2\log(f(0)) = -2\int_0^T \frac{1}{f(t-)}\,df(t) + [\log(f)]_T - \sum_{0 < t \le T} J_L\!\left(\frac{\Delta f(t)}{f(t-)}\right), \qquad (3.3)$$

where

$$J_L(\eta) = -2\eta + 2\log(1 + \eta) + (\log(1 + \eta))^2.$$

It is intuitively clear, but see also Corollary 6.5, that $V_{H^L}(f, P_\infty) = [\log(f)]_T$. Then it follows by
re-arrangement of equation (3.3) that the discrepancy between the realised value of the variance
swap $V_{H^L}(f, P_\infty)$ and the return generated by the classical continuous hedging strategy can be
represented as the sum of the jump contributions:

$$V_{H^L}(f, P_\infty) - \left(-2\log(f(T)) + 2\log(f(0)) + 2\int_0^T \frac{df(t)}{f(t-)}\right) = \sum_{0 < t \le T} J_L\!\left(\frac{\Delta f(t)}{f(t-)}\right).$$

We call this the hedging error with the convention that if the hedge sub-replicates the variance
swap then the hedging error is positive.

Now consider the kernel $H^R$ and define $V_{H^R}(f, P_\infty) = \int_0^T d[f]_t / f(t-)^2$, again, see Corollary 6.5
for justification. By a similar analysis, but adding and subtracting $\left(\frac{\Delta f(t)}{f(t-)}\right)^2$ instead
of the discontinuous part of the quadratic variation of $\log(f)$, we have

$$V_{H^R}(f, P_\infty) - \left(-2\log(f(T)) + 2\log(f(0)) + 2\int_0^T \frac{df(t)}{f(t-)}\right) = \sum_{0 < t \le T} J_R\!\left(\frac{\Delta f(t)}{f(t-)}\right),$$

where

$$J_R(\eta) = -2\eta + 2\log(1 + \eta) + \eta^2.$$
In the continuous case, under some mild regularity conditions on $f$ and $\mathcal{P}$, the variance swap value is independent of the chosen kernel. In contrast, the value of a variance swap in the general case is highly dependent on the chosen kernel.

To see that this is the case, and to examine the impact of jumps on the hedging error for the kernels $H^L$ and $H^R$, we consider the shapes of the functions $J_R$ and $J_L$, see Figure 1. For the kernel $H^L$, a downward jump results in a positive contribution to the hedging error. Thus, if all jumps are downwards, then the classical continuous hedging strategy sub-replicates $V_{H^L}(f, P_\infty)$. Conversely, upward jumps result in a negative contribution to the hedging error. The story is reversed for the kernel $H^R$.
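The sign pattern described above can be confirmed directly from the formulas for $J_L$ and $J_R$. The sketch below evaluates both functions at a few illustrative relative jump sizes (the particular values of `eta` are arbitrary choices, not taken from the text).

```python
import math

def J_L(eta: float) -> float:
    """Jump contribution to the hedging error for the kernel H^L."""
    return -2.0 * eta + 2.0 * math.log1p(eta) + math.log1p(eta) ** 2

def J_R(eta: float) -> float:
    """Jump contribution to the hedging error for the kernel H^R."""
    return -2.0 * eta + 2.0 * math.log1p(eta) + eta ** 2

down_jumps = [-0.5, -0.2, -0.05]   # relative jump sizes in (-1, 0)
up_jumps = [0.05, 0.2, 0.5]        # relative jump sizes above 0

for eta in down_jumps + up_jumps:
    print(f"eta={eta:+.2f}  J_L={J_L(eta):+.6f}  J_R={J_R(eta):+.6f}")
```

For small jumps both functions are of third order in `eta` (a Taylor expansion gives $J_L(\eta) \approx -\eta^3/3$ and $J_R(\eta) \approx 2\eta^3/3$), which is why the classical hedge performs well when jumps are small.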







Figure 1: $J_L$ (dashed line) is convex decreasing for $x < 0$ and concave decreasing for $x > 0$. In contrast $J_R$ (solid line) is first concave increasing and then convex increasing. The different shapes of these two curves explain the different nature of the dependence of the payoff of the variance swap on upward and downward jumps for different kernels.

It follows from the argument in the previous paragraph that for the kernel $H^L$ the hedging error will be maximised under scenarios for which the price realisation has downward jumps, but no upward jumps. Paths with this feature might arise as realisations of $-N$ where $N = (N_t)_{t\ge0}$ is a compensated Poisson process. Moreover, from the convexity of $J_L$ on $(-1, 0)$, it is plausible that the scenarios in which the hedging error is maximised are those in which price realisations have a single large downward jump, rather than a series of small jumps. Similarly, if we wish to minimise the hedging error we should expect a single large upward jump. The story is reversed for the kernel $H^R$.

In summary, we find that, under a continuity assumption on $f$, and for a dense sequence of partitions, the value of a variance swap is independent of the kernel and can be replicated with a static hedge in a forward contract and a dynamic hedging strategy. In the presence of jumps, however, the value of the variance swap depends on the kernel. An agent who holds a variance swap and hedges under the assumption of continuity may super-replicate or sub-replicate the payoff depending on the form of the jumps. For example, for the kernel $H^L$ an agent who acts as if the price realisation can be assumed to be continuous will sub-replicate the variance swap if there are downward jumps and no upward jumps. Such an agent will underprice the swap.

We will use the analysis of this section to give us intuition about the extremal models which will lead to the price bounds on variance swaps derived in Section 4. The bounds will depend crucially on the kernel. Models under which the variance swap with kernel $H^L$ has highest price (assuming consistency with a given set of call prices) will be characterised by a single downward jump and no upward jumps.

Remark 3.1. We will see later that the model which minimises the price for variance swaps with kernel $H^R$ also minimises the price for variation swaps with kernel $H^S$. If $f$ has a quadratic variation, then in the continuous limit $V_{H^S}(f, P_\infty) = \sum_{0<t\le T} (\Delta f(t))^3$. This payoff will be smallest if all jumps are downwards and we will see that if the call prices are given for expiry time $T$, then the model that produces the lowest price is one under which the price path has a single downward jump.



3.3 A related Skorokhod embedding problem 



In this section we relate the problem of finding extremal prices for the variance swap to a 
Skorokhod embedding problem. Again the aim is to develop intuition which will guide the 
derivation of the optimal model-free hedges in the next section. 

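Let $\mu$ be a measure on $\mathbb{R}_+$ with mean $m$ and let $\mathcal{M}_\mu$ be the class of all martingales such that for $X \in \mathcal{M}_\mu$, $X_0 = m$ and $X_T \sim \mu$. For each $X \in \mathcal{M}_\mu$ there exists a time-change $t \mapsto A_t$, null at 0, such that $X_t = B_{A_t}$ for a Brownian motion $B$ started at $m$. Suppose that we have a filtered probability space $(\Omega, \mathcal{G}, \mathbb{G}, \mathbb{P})$ such that $B$ is a $\mathbb{G}$-Brownian motion with $B_0 = m$. Then $X$ is adapted to the filtration $\mathbb{F} = (\mathcal{F}_t)_{t\ge0}$ where $\mathcal{F}_t = \mathcal{G}_{A_t}$.

Let $A^c$ be the continuous part of $A$. Note that $dA^c_t = (dX^c_t)^2 = d[X]^c_t$. Let $S^X = (S^X_t)_{t\ge0}$ (respectively $S$) be the process of the running maximum of $X$ (respectively $B$) so that $S^X_t = \sup_{u\le t} X_u$. Note that $X_t \le S^X_t \le S_{A_t}$ and then, path-by-path with $\Delta B_{A_t} = B_{A_t} - B_{A_{t-}}$, we have

$$\int_0^T \frac{d[X]_t}{X_{t-}^2} \ge \int_0^T \frac{d[X]^c_t}{(S^X_{t-})^2} + \sum_{0<t\le T}\left(\frac{\Delta X_t}{S^X_{t-}}\right)^2 \qquad (3.4)$$

$$\ge \int_0^T \frac{dA^c_t}{S_{A_{t-}}^2} + \sum_{0<t\le T}\left(\frac{\Delta B_{A_t}}{S_{A_{t-}}}\right)^2. \qquad (3.5)$$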



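We suppose, for the moment, that $\mu$ has a second moment. Then $(X_t)_{0\le t\le T}$ is a square-integrable martingale and we find that

$$\mathbb{E}\left[\int_0^T \frac{dA^c_t}{S_{A_{t-}}^2}\right] + \mathbb{E}\left[\sum_{0<t\le T}\left(\frac{\Delta B_{A_t}}{S_{A_{t-}}}\right)^2\right] = \mathbb{E}\left[\int_0^T \frac{dA^c_t + \Delta A_t}{S_{A_{t-}}^2}\right] = \mathbb{E}\left[\int_0^T \frac{dA_t}{S_{A_{t-}}^2}\right] \ge \mathbb{E}\left[\int_0^{A_T} \frac{du}{S_u^2}\right].$$

This motivates looking at the following problem:

$$\min_{\tau \in UI(\mu)} \mathbb{E}\left[\int_0^\tau \frac{du}{S_u^2}\right] \qquad (3.6)$$

where $UI(\mu)$ is the class of stopping times $\tau$ such that $B_\tau \sim \mu$ and $(B_{t\wedge\tau})_{t\ge0}$ is uniformly integrable. This problem is a special case of a problem considered in Hobson and Klimmek [20], where it is proved that the minimum is attained by the Perkins embedding, which we will denote $\tau^P_\mu$. Note that the Perkins solution of the Skorokhod embedding problem is generally defined for centred probability measures, but the translation to measures with non-zero mean equal to the non-zero starting point is trivial.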

Let $I = (I_t)_{t\ge0}$ denote the infimum process $I_t = \inf_{u\le t} B_u$.

Theorem 3.2. [Perkins [27], Hobson and Pedersen [21]] Given $\nu$ a probability measure with support on $\mathbb{R}_+$, with mean $m$, let $Z_\nu$ denote a random variable with law $\nu$ and define $C_\nu(z) = \mathbb{E}[(Z_\nu - z)^+]$ and $P_\nu(z) = \mathbb{E}[(z - Z_\nu)^+]$. Define also $\alpha = \alpha_\nu : (m, \infty) \to [0, m)$ and $\beta = \beta_\nu : (0, m) \to (m, \infty)$ by

$$\alpha(z) = \operatorname*{argmin}_{y<m} \frac{C_\nu(z) - P_\nu(y)}{z - y}, \qquad \beta(z) = \operatorname*{argmin}_{y>m} \frac{P_\nu(z) - C_\nu(y)}{y - z}. \qquad (3.7)$$

Let $B$ be Brownian motion started at $m$, with maximum process $S$ and minimum process $I$. Suppose $\nu$ has no atom at $m$. Then $\tau^P_\nu := \inf\{u > 0 : B_u \le \alpha_\nu(S_u) \text{ or } B_u \ge \beta_\nu(I_u)\}$ solves the Skorokhod embedding problem for $\nu$ in the sense that $B_{\tau^P_\nu} \sim \nu$ and $(B_{t\wedge\tau^P_\nu})_{t\ge0}$ is uniformly integrable.

If $\nu$ has an atom at $m$ then we assume $\mathcal{F}_0$ is sufficiently rich as to support a uniform random variable $Z_U$, which is independent of $B$. Then

$$\tau^P_\nu = \begin{cases} 0 & Z_U \le \nu(\{m\}) \\ \inf\{u > 0 : B_u \le \alpha_\nu(S_u) \text{ or } B_u \ge \beta_\nu(I_u)\} & Z_U > \nu(\{m\}) \end{cases}$$

solves the Skorokhod embedding problem for $\nu$.

The Perkins embedding has a minimality property in that for increasing functions $F$ it minimises $\mathbb{E}[F(S_\tau)]$ over embeddings $\tau$ of $\nu$. Moreover, as shown in [20], it also minimises the expected value of functionals of the joint law of the running maximum and terminal value $F(B_\tau, S_\tau)$ over stopping times $\tau$ in $UI(\nu)$, provided $F$ satisfies some consistency conditions. The salient characteristic of the Perkins embedding which results in optimality is that either $B_{\tau^P_\nu} = S_{\tau^P_\nu}$ or $B_{\tau^P_\nu} = \alpha_\nu(S_{\tau^P_\nu})$.

Now consider the problem of finding the consistent model for which $V_{H^R}(X, P_\infty)$ has lowest possible price, and recall that knowledge of call prices is equivalent to knowledge of the marginal law $\mu$ of $X_T$. To obtain the lowest possible price we might expect equality in each of (3.4)-(3.5), and thus that just before a jump, the process is at its current maximum. Moreover, the model should be related to the Perkins embedding.

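Lemma 3.3. Let $B$ be Brownian motion started at $m$. Let $H_b = \inf\{u > 0 : B_u = b\}$ be the first hitting time of level $b$ by Brownian motion. Let $\Lambda(t)$ be a strictly increasing, continuous function such that $\Lambda(0) = m$ and $\lim_{t\uparrow T} \Lambda(t)$ is infinite.

Define the process $\tilde{Q}^\mu = (\tilde{Q}^\mu_t)_{0\le t\le T}$ by

$$\tilde{Q}^\mu_t = B_{H_{\Lambda(t)} \wedge \tau^P_\mu} \qquad (3.8)$$

and let $Q^\mu$ be the right-continuous modification of $\tilde{Q}^\mu$.

Then $Q^\mu$ is a martingale such that $Q^\mu_T \sim \mu$. Moreover, the paths of $Q^\mu$ are continuous and increasing, except possibly at a single jump time. Finally, either $Q^\mu_T = B_{\tau^P_\mu} = S_{\tau^P_\mu}$ or $Q^\mu_T = B_{\tau^P_\mu} = \alpha_\mu(S_{\tau^P_\mu})$.

Proof. Since $\tau^P_\mu$ is finite almost surely we have that $Q^\mu_T = B_{\tau^P_\mu} \sim \mu$. Moreover, for $\Lambda(t) \le S_{\tau^P_\mu}$, $\tilde{Q}^\mu_t = \Lambda(t) = B_{H_{\Lambda(t)}} = S_{H_{\Lambda(t)}}$. □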

The martingale $Q^\mu$ will be used in Section 6 to show that in the continuous-time limit, the bounds we obtain are tight. The martingale $Q^\mu$ is related to the Perkins embedding in the same way that the Dubins-Gilat [15] martingale is related to the Azema-Yor [1] embedding.

We can also consider a reflected version of the martingale $Q^\mu$ based on the infimum process rather than the maximum process.
Lemma 3.4. Let $\lambda(t)$ be a strictly decreasing, continuous function such that $\lambda(0) = m$ and $\lim_{t\uparrow T} \lambda(t)$ is zero.

Define the process $\tilde{R}^\mu = (\tilde{R}^\mu_t)_{0\le t\le T}$ by

$$\tilde{R}^\mu_t = B_{H_{\lambda(t)} \wedge \tau^P_\mu}$$

and let $R^\mu$ be the right-continuous modification of $\tilde{R}^\mu$.

Then $R^\mu$ is a martingale such that $R^\mu_T \sim \mu$. Moreover, the paths of $R^\mu$ are continuous and decreasing, except possibly at a single jump time. Finally, either $R^\mu_T = B_{\tau^P_\mu} = I_{\tau^P_\mu}$ or $R^\mu_T = B_{\tau^P_\mu} = \beta_\mu(I_{\tau^P_\mu})$.

Remark 3.5. In this section we have exploited a connection between the problem of finding bounds on the prices of variance swaps and the Skorokhod embedding problem. This link is one of the recurring themes of the literature on model-independent bounds, see Hobson [19]. We exhibit this link for the kernel $H^R$, and in this sense at least, it seems that variance swaps defined via $H^R$ are the more natural mathematical object. Nonetheless, the intuition developed via $H^R$ and the Skorokhod embedding problem is valid more widely.

4 Path-wise Bounds for Variance Swaps 

Previous sections have defined notation and developed intuition for the problem. Now we begin the construction of path-wise hedging strategies. We do this by defining a class of synthesisable payoffs with a useful extra property which can be exploited to give sub-hedges. Then, motivated by the results of Section 3.3, we define a further class of payoffs which are based on decreasing functions. Finally we show that for the kernel $H^R$, members of this new class belong to the former class also, and thus yield sub-hedges.

To construct a sub-hedge for a variation swap with kernel $H$ for any price realisation $f$, suppose that there exists a pair of functions $(\psi, \delta)$ such that for all $x, y \in \mathbb{R}$

$$H(x, y) \ge \psi(y) - \psi(x) + \delta(x)(y - x). \qquad (4.1)$$

Then we may interpret $(\psi, \delta)$ as a semi-static hedging strategy (for a Markov and time-homogeneous dynamic strategy) and then for any price realisation $f$ and partition $P$,

$$V_H(f, P) \ge \psi(f(T)) - \psi(f(0)) + \sum_k \delta(f(t_k))(f(t_{k+1}) - f(t_k)).$$

By Definition 2.11 we have constructed a sub-hedge for the variation swap with kernel $H$.
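The passage from the pointwise inequality (4.1) to the path-wise bound is a telescoping sum, which the following sketch makes concrete. It assumes the quadratic kernel $H^Q(x,y) = (y-x)^2$ (consistent with $\Phi^Q = 2$ in Example 4.6) and an illustrative pair $\psi(y) = c y^2$, $\delta = -\psi'$ with a hypothetical constant $c \in [0,1]$; neither the pair nor the path is taken from the text.

```python
def H_Q(x: float, y: float) -> float:
    """Assumed quadratic kernel H^Q(x, y) = (y - x)^2."""
    return (y - x) ** 2

c = 0.5  # hypothetical scaling; any 0 <= c <= 1 satisfies (4.1) for H^Q

def psi(y: float) -> float:
    """Static payoff of the illustrative semi-static strategy."""
    return c * y * y

def delta(x: float) -> float:
    """Dynamic position, chosen as -psi'(x)."""
    return -2.0 * c * x

# Hypothetical price observations on a partition 0 = t_0 < ... < t_N = T.
path = [1.0, 1.1, 0.9, 0.95, 1.3]
steps = list(zip(path, path[1:]))

payoff = sum(H_Q(a, b) for a, b in steps)
sub_hedge = psi(path[-1]) - psi(path[0]) + sum(delta(a) * (b - a) for a, b in steps)

print(f"variation swap payoff = {payoff:.6f}")
print(f"sub-hedge payoff      = {sub_hedge:.6f}")
```

For this pair each step of (4.1) has slack $(1-c)(y-x)^2$, so the total slack is $(1-c)$ times the payoff. With $c = 1$ the hedge replicates exactly.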

Suppose now that $H$ is a variance swap kernel, and that $\psi$ is differentiable. Recall that $H_y(x, x) = 0$. Dividing both sides of (4.1) by $y - x$ and letting $y \downarrow x$, we find that $\delta(x) \le -\psi'(x)$. Similarly, letting $y \uparrow x$, $\delta(x) \ge -\psi'(x)$. Thus if (4.1) is to hold we must have that $\delta = -\psi'$ and our search for pairs of functions satisfying (4.1) is reduced to finding differentiable functions $\psi$ satisfying

$$H(x, y) \ge \psi(y) - \psi(x) - \psi'(x)(y - x), \qquad (4.2)$$

or equivalently, $\psi(y) \le H(x, y) + \psi(x) + \psi'(x)(y - x)$. Note that there is equality in this last expression at $y = x$.

Definition 4.1. A differentiable function $\psi$ is a candidate sub-hedge payoff if for all $y \in (0, \infty)$,

$$\psi(y) = \inf_x \left\{ H(x, y) + \psi'(x)(y - x) + \psi(x) \right\}. \qquad (4.3)$$

Given a candidate sub-hedge payoff $\psi$ we can generate a candidate semi-static hedge $(\psi, \delta)$ by taking $\delta = -\psi'$. We will say that $\psi$ is the root of the semi-static sub-hedge $(\psi, -\psi')$.

It remains to show how to choose candidate sub-hedge payoffs and especially those which have good properties. Using the intuition developed in the previous section for the kernel $H^R$ we expect optimal sub-hedging strategies to be associated with the martingale $Q$ defined in (3.8). For realisations of $Q$, either the path has no jump, or there is a single jump, and if the jump occurs when the process is at $x$ then the jump is to $\alpha(x)$.

With this in mind let $\mathcal{K} = \mathcal{K}(f(0))$ be the set of monotone decreasing right-continuous functions $\kappa : [f(0), \infty) \to (0, f(0)]$, with $\kappa(f(0)) = f(0)$. Let $k$ denote the inverse of $\kappa$. For $y < f(0)$ we want the infimum in (4.3) to be attained at $x = k(y)$. Then $\psi$ must satisfy

$$\psi(y) = H(k(y), y) + \psi(k(y)) + \psi'(k(y))(y - k(y)). \qquad (4.4)$$

Moreover, if $\psi'$ is differentiable, then for $x = k(y)$ to be the argument of the infimum in (4.3) we must have that $k$ satisfies $H_x(k(y), y) + \psi''(k(y))(y - k(y)) = 0$ or equivalently

$$H_x(x, \kappa(x)) = \psi''(x)(x - \kappa(x)). \qquad (4.5)$$

This suggests that we can define candidate sub-hedge payoffs $\psi$ via (4.5) on $(f(0), \infty)$ and via (4.4) on $(0, f(0))$.

If $\psi$ satisfies (4.2) then so does $\tilde{\psi}(y) = \psi(y) + a + by$ for any constants $a, b$. Earlier we argued that without loss of generality for a semi-static hedging strategy we could assume $\psi'(f(0)) = 0$. Now we may restrict attention further to $\psi$ with $\psi(f(0)) = 0$.

Define $\Phi(u, y) = H_x(u, y)/(u - y)$. Write $\Phi^R(u, y) = H^R_x(u, y)/(u - y)$, and similarly for other kernels.

Definition 4.2. For $\kappa \in \mathcal{K}$, with inverse $k$, define $\psi_{\kappa,H} = \psi_\kappa : (0, \infty) \to \mathbb{R}_+$ by $\psi_\kappa(f(0)) = 0$ and

$$\psi_\kappa(x) = \int_{f(0)}^x (x - u)\,\Phi(u, \kappa(u))\,du, \qquad x > f(0),$$

$$\psi_\kappa(z) = \psi_\kappa(k(z)) + \psi'_\kappa(k(z))(z - k(z)) + H(k(z), z), \qquad z < f(0).$$

We call such a function a candidate payoff of Class $\mathcal{K}$.

By convention we use the variable $x$ on $(f(0), \infty)$ and $z$ on $(0, f(0))$, to reflect the fact that $\psi$ is defined explicitly on the former set, but only implicitly on the latter.

For the present we fix $\kappa$ and we write simply $\psi$ for $\psi_\kappa$. Note that the value of $\psi(x)$ does not depend on the right-continuity assumption for $\kappa$. Further, observe that if $\kappa$ is not injective and there is an interval $A_z = \{x : \kappa(x) = z\} \subset (m, \infty)$ over which $\kappa$ takes the value $z$ then $k$ has a jump at $z$. Nonetheless, the value of $\psi(z)$ does not depend on the choice of $k(z)$. To see this, for $x \in A_z$ consider $\Psi(x) := \psi(x) + \psi'(x)(z - x) + H(x, z)$. Then, on $A_z$, $d\Psi/dx = \psi''(x)(z - x) + H_x(x, z) = 0$, using (4.5).

Motivated by the results of Section 3.3 we have defined $\psi$ relative to the set of decreasing functions $\mathcal{K}$, with the aim of constructing a sub-hedge. However, there are analogous definitions based on constructing super-hedges or using the martingale $R$ or both.

Definition 4.3. $\psi : (0, \infty) \to (0, \infty)$ is a candidate super-hedge payoff if for all $y \in (0, \infty)$,

$$\psi(y) = \sup_x \left\{ H(x, y) + \psi'(x)(y - x) + \psi(x) \right\}. \qquad (4.6)$$

Let $\mathcal{L} = \mathcal{L}(f(0))$ be the set of monotone increasing functions $\ell : (0, f(0)) \to (f(0), \infty)$, with $\ell(f(0)) = f(0)$. Let $l$ be inverse to $\ell$.

Definition 4.4. For $\ell \in \mathcal{L}$ with inverse $l$, define $\psi_\ell : (0, \infty) \to \mathbb{R}_+$, the candidate payoff of Class $\mathcal{L}$, by $\psi_\ell(f(0)) = 0$ and, in direct analogy with Definition 4.2,

$$\psi_\ell(x) = \int_x^{f(0)} (u - x)\,\Phi(u, \ell(u))\,du, \qquad x < f(0),$$

$$\psi_\ell(z) = \psi_\ell(l(z)) + \psi'_\ell(l(z))(z - l(z)) + H(l(z), z), \qquad z > f(0).$$

Our next aim is to give conditions which guarantee that the semi-static strategy $(\psi_\kappa, -\psi'_\kappa)$ satisfies equation (4.1).

Definition 4.5. A variation swap kernel $H$ is an increasing (a decreasing) kernel if it is a regular variation swap kernel and

(i) $\Phi(u, y)$ is monotone increasing (decreasing) in $y$,

(ii) $H(a, b) + H_y(a, b)(c - b) \ge (\le)\, H(a, c) - H(b, c)$ for all $a > b$.

The second condition in Definition 4.5 is equivalent to the fact that $H_{yy}(x, y)$ is increasing (decreasing) in its first argument.

Example 4.6. $H^R$ and $H^S$ are increasing kernels and $H^L$ is a decreasing kernel. The kernels $H^B$ and $H^Q$ are simultaneously both increasing and decreasing since $\Phi^B(u, y) = 2u^{-2}$ and $\Phi^Q(u, y) = 2$ do not depend on $y$ and Condition (ii) in Definition 4.5 is satisfied with equality in both cases.

Example 4.7. Consider the kernels $H^{G-}(u, y) = uH^R(u, y)$ and $H^{G+}(u, y) = yH^R(u, y)$. In the first case, variance is weighted by the pre-jump value of the price realisation and in the second case the variance is weighted by the post-jump value. Swaps of this type are known as Gamma swaps, see, for example, Carr and Lee [9]. Both $H^{G-}$ and $H^{G+}$ are increasing kernels.

Theorem 4.8. (i) (a) If $H$ is an increasing kernel then every candidate payoff of Class $\mathcal{K}$ is the root of a semi-static sub-hedge for the kernel $H$.

(b) If $H$ is an increasing kernel then every candidate payoff of Class $\mathcal{L}$ is the root of a semi-static super-hedge for the kernel $H$.

(ii) (a) If $H$ is a decreasing kernel then every candidate payoff of Class $\mathcal{L}$ is the root of a semi-static sub-hedge for the kernel $H$.

(b) If $H$ is a decreasing kernel then every candidate payoff of Class $\mathcal{K}$ is the root of a semi-static super-hedge for the kernel $H$.

Proof. We will prove the theorem in the case (i)(a). The proofs in the other cases are similar.

Fix $\kappa \in \mathcal{K}$ and let $L_\kappa(x, y) = \psi_\kappa(x) + \psi'_\kappa(x)(y - x) + H(x, y) - \psi_\kappa(y)$. The result will follow if we can show that $L_\kappa(x, y) \ge 0$ for all $(x, y) \in (0, \infty)^2$. Since $\kappa$ is fixed we drop the subscript $\kappa$ in what follows.

Suppose that $x, z > f(0)$ and $y \in (0, \infty)$. Since $\psi(x) + \psi'(x)(y - x) = \int_{f(0)}^x (y - u)\,\Phi(u, \kappa(u))\,du$ and $H(x, y) - H(z, y) = \int_z^x H_x(u, y)\,du = \int_z^x (u - y)\,\Phi(u, y)\,du$, we have that

$$L(x, y) - L(z, y) = \psi(x) + \psi'(x)(y - x) + H(x, y) - \psi(z) - \psi'(z)(y - z) - H(z, y) = \int_z^x \left\{\Phi(u, y) - \Phi(u, \kappa(u))\right\}(u - y)\,du.$$

If $y > f(0)$, then set $z = y$ to find that

$$L(x, y) = \int_y^x \left\{\Phi(u, y) - \Phi(u, \kappa(u))\right\}(u - y)\,du.$$

Since $y > f(0) \ge \kappa(u)$, $\Phi(u, y) \ge \Phi(u, \kappa(u))$ for all $u$. Hence $L(x, y) \ge 0$ with equality at $y = x$.

If $y < f(0)$ and $k$ is continuous at $y$ set $z = k(y)$. Otherwise, for definiteness set $z = k(y+)$. Then $L(k(y+), y) = 0$ and

$$L(x, y) = \int_{k(y+)}^x \left\{\Phi(u, y) - \Phi(u, \kappa(u))\right\}(u - y)\,du.$$

If $k(y+) < x$ then $\kappa(u) \le y$ for all $u \in (k(y+), x)$. Since $\Phi(u, z)$ is increasing in $z$ and $u > y$, the integrand is non-negative.

If $x < k(y+)$, then $\kappa(u) \ge y$ for all $u \in (x, k(y+))$, so the integrand is non-positive on this range. Since the integral now runs from the larger limit to the smaller one, again $L(x, y) \ge 0$.

Finally, we show that $L(x, y) \ge 0$ when $x < f(0)$. Note that since, by what we have shown above, $L(k(x), y) \ge 0$, it will suffice to show that $L(x, y) \ge L(k(x), y)$. But, using $\psi'(x) = \psi'(k(x)) + H_y(k(x), x)$ for $x < f(0)$,

$$L(x, y) - L(k(x), y) = \psi(x) + \psi'(x)(y - x) + H(x, y) - \psi(k(x)) - \psi'(k(x))(y - k(x)) - H(k(x), y)$$

$$= \psi(k(x)) + \psi'(k(x))(x - k(x)) + H(k(x), x) + \psi'(k(x))(y - x) + H_y(k(x), x)(y - x) + H(x, y) - \psi(k(x)) - \psi'(k(x))(y - k(x)) - H(k(x), y)$$

$$= H(k(x), x) + H(x, y) + H_y(k(x), x)(y - x) - H(k(x), y) \ge 0,$$

where the last inequality follows from Definition 4.5. □

5 The most expensive sub-hedge 

In the next three sections we concentrate on lower bounds and increasing variance kernels, but there are equivalent results for upper bounds and/or decreasing variance kernels.

In this section we fix the call prices and attempt to identify the most expensive sub-hedge from the set of sub-hedges generated by candidate payoffs of Class $\mathcal{K}$. The price of this sub-hedge provides a highest model-independent lower bound on the price of the variance swap in a sense which we will explain in the section on continuous limits.

Associated with the set of call prices $C(k)$ (and put prices $C(k) + k - f(0)$ given by put-call parity) there is a measure $\mu$ on $\mathbb{R}_+$ with mean $m$. Since $f$ is a forward price we must have $f(0) = m$. Write $C = C_\mu$ to emphasise the connection between these quantities. Then $C(k) = C_\mu(k) = \int_k^\infty (x - k)\,\mu(dx)$. Recall that $C_\mu$ is convex so that $\mu(dx) = C''_\mu(x)\,dx$ with the right-hand side to be interpreted in a distributional sense as necessary. We wish to calculate the cost of the European claim which forms part of the semi-static sub-hedge. By construction this is equal to

$$\int_{\mathbb{R}_+} \psi(x)\,\mu(dx) = \int_0^m \psi''(z)\left(C_\mu(z) + z - m\right)dz + \int_m^\infty \psi''(x)\,C_\mu(x)\,dx.$$
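The decomposition of the cost of the European claim into a strip of puts below $m$ and calls above $m$ can be sanity-checked on a simple case. The sketch below assumes $\mu = U[0,2]$ (so $m = 1$ and $C_\mu(k) = (2-k)^2/4$) and the illustrative payoff $\psi(x) = (x-m)^2$, which satisfies $\psi(m) = \psi'(m) = 0$; both choices are assumptions made for the example, and the put price is obtained from put-call parity as $C_\mu(k) + k - m$.

```python
def simpson(g, a: float, b: float, n: int = 200) -> float:
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

m = 1.0  # mean of mu = U[0, 2], equal to the forward price f(0)

def call(k: float) -> float:
    """Call price C_mu(k) for mu = U[0, 2] and 0 <= k <= 2."""
    return (2.0 - k) ** 2 / 4.0

def put(k: float) -> float:
    """Put price from put-call parity: P(k) = C(k) + k - m."""
    return call(k) + k - m

def psi(x: float) -> float:
    """Illustrative payoff with psi(m) = psi'(m) = 0."""
    return (x - m) ** 2

def psi_second(x: float) -> float:
    """Second derivative of psi."""
    return 2.0

# Direct cost: integral of psi against the density 1/2 on [0, 2].
direct_cost = simpson(lambda x: psi(x) * 0.5, 0.0, 2.0)

# Cost via the strip of puts (strikes below m) and calls (strikes above m).
strip_cost = (simpson(lambda z: psi_second(z) * put(z), 0.0, m)
              + simpson(lambda x: psi_second(x) * call(x), m, 2.0))

print(direct_cost, strip_cost)
```

Both quantities equal the variance of $U[0,2]$, namely $1/3$.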

Proposition 5.1. For $H$ a variance swap kernel and $\kappa \in \mathcal{K}(m)$,

$$\int_0^\infty \psi_\kappa(x)\,\mu(dx) = \int_0^m \mu(dz)\,H(m, z) + \int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u)) \qquad (5.1)$$

where, for $v < m < u$,

$$\Sigma^{(u)}_\mu(v) = \Phi(u, v)\,C_\mu(u) + \int_{(0,v]} \mu(dz)(u - z)\left\{\Phi(u, z) - \Phi(u, v)\right\}.$$

Proof. Let $\psi = \psi_\kappa$. Note that by definition $\psi(m) = 0$, so there is no contribution from mass at $m$ and we can divide the integral on the left of (5.1) into intervals $(0, m)$ and $(m, \infty)$. For the latter,

$$\int_m^\infty \psi(x)\,\mu(dx) = \int_m^\infty \mu(dx) \int_m^x (x - u)\,\Phi(u, \kappa(u))\,du = \int_{u=m}^\infty du\,\Phi(u, \kappa(u)) \int_u^\infty (x - u)\,\mu(dx) = \int_{u=m}^\infty du\,\Phi(u, \kappa(u))\,C_\mu(u) =: I_1.$$

Now consider $\int_0^m \psi(z)\,\mu(dz)$. For this, using $H(k, z) = H(m, z) + \int_m^k H_x(u, z)\,du$ and $\psi(x) + \psi'(x)(z - x) = \int_m^x du\,(z - u)\,\Phi(u, \kappa(u))$ we have

$$\int_0^m \psi(z)\,\mu(dz) = \int_0^m \mu(dz)\,H(m, z) + \int_0^m \mu(dz) \int_m^{k(z)} du\,(u - z)\left\{\Phi(u, z) - \Phi(u, \kappa(u))\right\} =: I_2 + I_3.$$

Note that $I_2$ depends on $H$ but not on $\kappa$. Moreover, $I_3$ does not depend on the particular values chosen for the inverse $k$ taken over intervals of constancy of $\kappa$. (If $\underline{x} < \bar{x}$ are a pair of possible values for $k(z)$ then $\int_{\underline{x}}^{\bar{x}} du\,(u - z)\{\Phi(u, z) - \Phi(u, \kappa(u))\} = 0$ since over this range $\kappa(u) = z$.)

Changing the order of integration we have

$$I_3 = \int_m^\infty du \int_{(0,\kappa(u)]} \mu(dz)(u - z)\left\{\Phi(u, z) - \Phi(u, \kappa(u))\right\},$$

and then $I_1 + I_3 = \int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u))$. □



Our goal is to maximise the expression (5.1) over decreasing functions $\kappa \in \mathcal{K}$. As noted above, $I_2$ is independent of $\kappa$, and to maximise $\int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u))$ we can maximise $\Sigma^{(u)}_\mu(\kappa)$ separately for each $u > m$, and then check that the maximiser is a decreasing function of $u$.

Proposition 5.2. Suppose $H$ is an increasing variance swap kernel. Then $\int_0^\infty \psi_\kappa(x)\,\mu(dx)$ is maximised over $\kappa \in \mathcal{K}$ by $\kappa = \alpha_\mu$ where $\alpha_\mu$ is the quantity which arises in (3.7) in the definition of the Perkins solution to the Skorokhod embedding problem.

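Proof. For $u > m$ consider $\Theta^{(u)}_\mu(v) := C_\mu(u) - \int_{(0,v]} \mu(dz)(u - z)$ defined for $v \in (0, u)$. Then for each $u$, $\Theta^{(u)}_\mu$ is a strictly decreasing right-continuous function taking both positive and negative values on $(0, m)$. Let $\bar{\kappa} = \bar{\kappa}(u) = \sup\{v : \Theta^{(u)}_\mu(v) > 0\}$. We have $\Theta^{(u)}_\mu(\bar{\kappa}-) \ge 0 \ge \Theta^{(u)}_\mu(\bar{\kappa}+)$.

Suppose $H$ is an increasing variance swap kernel so that $\Phi(u, y)$ is increasing in $y$. We want to show that $\Sigma^{(u)}_\mu(v)$ is maximised by $v = \bar{\kappa}(u)$.

Suppose $m > v > \bar{\kappa}(u)$. We aim to show that for all $\kappa \in (\bar{\kappa}(u), v)$ we have $\Sigma^{(u)}_\mu(v) \le \Sigma^{(u)}_\mu(\kappa)$. We have

$$\Sigma^{(u)}_\mu(v) - \Sigma^{(u)}_\mu(\kappa) = \Phi(u, v)\,C_\mu(u) + \int_0^v \mu(dz)(u - z)\left\{\Phi(u, z) - \Phi(u, v)\right\} - \Phi(u, \kappa)\,C_\mu(u) - \int_0^\kappa \mu(dz)(u - z)\left\{\Phi(u, z) - \Phi(u, \kappa)\right\}$$

$$= \int_\kappa^v \mu(dz)(u - z)\left\{\Phi(u, z) - \Phi(u, v)\right\} + \left[\Phi(u, v) - \Phi(u, \kappa)\right]\Theta^{(u)}_\mu(\kappa).$$

Since $H$ is an increasing variance kernel, for $z \in (\kappa, v)$, $\Phi(u, z) \le \Phi(u, v)$, and the first integral is non-positive. Furthermore, $\Phi(u, v) \ge \Phi(u, \kappa)$ and $\Theta^{(u)}_\mu(\kappa) \le 0$. Hence we conclude that $\Sigma^{(u)}_\mu(v) \le \Sigma^{(u)}_\mu(\kappa)$.

Similar arguments show that if $v < \bar{\kappa}(u)$ then $\Sigma^{(u)}_\mu(v) \le \Sigma^{(u)}_\mu(\kappa)$ for any $\kappa \in (v, \bar{\kappa}(u))$ and it follows that $\kappa = \bar{\kappa}(u)$ is a maximiser of $\Sigma^{(u)}_\mu$.

Note that $\bar{\kappa}(u)$ is precisely the quantity $\alpha_\mu(u)$ which arises in the Perkins construction. Hence $\bar{\kappa}$ is a decreasing function. Moreover, the definition $\bar{\kappa}(u) = \sup\{v : \Theta^{(u)}_\mu(v) > 0\}$ ensures that $\bar{\kappa}$ is right continuous. □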
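The characterisation of the optimal $\bar{\kappa}(u)$ as the point where $\Theta^{(u)}_\mu$ changes sign lends itself to a simple root search. The following sketch assumes $\mu = U[0,2]$ (an illustrative choice, with $m = 1$) and compares a bisection on $v \mapsto \Theta^{(u)}_\mu(v)$ with the closed form $u - 2\sqrt{u-1}$ quoted for this law in Example 5.4; the function names are hypothetical.

```python
import math

def call_uniform(u: float) -> float:
    """C_mu(u) for mu = U[0, 2] and 0 <= u <= 2."""
    return (2.0 - u) ** 2 / 4.0

def theta(u: float, v: float) -> float:
    """Theta^{(u)}(v) = C_mu(u) - int_{(0,v]} (u - z) mu(dz), with mu(dz) = dz/2."""
    return call_uniform(u) - (u * v - 0.5 * v * v) / 2.0

def kappa_bar(u: float, tol: float = 1e-12) -> float:
    """Bisection for the sign change of v -> theta(u, v) on [0, 1], for 1 < u < 2."""
    lo, hi = 0.0, 1.0          # theta(u, 0) >= 0 >= theta(u, 1) on this range of u
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if theta(u, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

test_points = [1.1, 1.25, 1.5, 1.75, 1.9]
for u in test_points:
    closed_form = u - 2.0 * math.sqrt(u - 1.0)
    print(f"u={u:.2f}  bisection={kappa_bar(u):.10f}  closed form={closed_form:.10f}")
```

The bisection values decrease in $u$, in line with the monotonicity of $\bar{\kappa}$ established in the proof.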

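Corollary 5.3. Suppose $\kappa_n(x)$ is a sequence of elements of $\mathcal{K}$ with $\kappa_n(x) \downarrow \bar{\kappa}(x)$. Then $\int \psi_{\kappa_n}(x)\,\mu(dx)$ converges monotonically to $\int \psi_{\bar{\kappa}}(x)\,\mu(dx)$.

Proof. Recall from (5.1) that $\int_{[0,\infty)} \psi_\kappa(x)\,\mu(dx) = \int_0^m \mu(dz)\,H(m, z) + \int_m^\infty du\,\Sigma^{(u)}_\mu(\kappa(u))$. By the above arguments we have that $\Sigma^{(u)}_\mu(z)$ is decreasing in $z$ for $z > \bar{\kappa}(u)$, so that $\Sigma^{(u)}_\mu(\kappa_n(u))$ increases to $\Sigma^{(u)}_\mu(\bar{\kappa}(u))$. Hence the result follows by monotone convergence. □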

Example 5.4. Let $H = H^R$, an increasing variance kernel. Let $\mu = U[0, 2]$ and let $\kappa : [1, 2] \to [0, 1]$ be given by $\kappa(x) = \alpha_\mu(x) = x - 2\sqrt{x - 1}$. Similarly we define $\ell(x) = \beta_\mu(x) = x + 2\sqrt{1 - x}$. Then $(\psi_\kappa, -\psi'_\kappa)$ is the most expensive sub-hedge of class $\mathcal{K}$ and $(\psi_\ell, -\psi'_\ell)$ is the cheapest super-hedge of class $\mathcal{L}$. Although we cannot calculate the functions $\psi_\kappa, \psi_\ell$ explicitly, they can be evaluated numerically, see the left hand side of Figure 2. Now suppose $H = H^L$. The roles of $\psi_\kappa$ and $\psi_\ell$ are reversed (see the right hand side of Figure 2): $\psi_\kappa$ is the root of a semi-static super-hedge and $\psi_\ell$ is the root of a semi-static sub-hedge.
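The numerical evaluation of $\psi_\kappa$ mentioned in the example can be sketched as follows for $H = H^R$ and $\mu = U[0,2]$. The sketch assumes $H^R(x,y) = ((y-x)/x)^2$, so that $\Phi^R(u,y) = 2y/u^3$, and uses the substitution $u = 1 + s^2$ (under which $\kappa(u) = (1-s)^2$) to keep the integrand smooth. The quadrature rule, grid and function names are illustrative choices. It then checks on a grid that $L(x,y) = \psi(x) + \psi'(x)(y-x) + H(x,y) - \psi(y)$ from the proof of Theorem 4.8 is non-negative up to quadrature error.

```python
import math

def simpson(g, a: float, b: float, n: int = 400) -> float:
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0

def H_R(x: float, y: float) -> float:
    return ((y - x) / x) ** 2

def H_R_y(x: float, y: float) -> float:
    """Partial derivative of H_R in its second argument."""
    return 2.0 * (y - x) / x ** 2

def k_inv(z: float) -> float:
    """Inverse of kappa(x) = x - 2 sqrt(x - 1); maps [0, 1] onto [1, 2]."""
    return 1.0 + (1.0 - math.sqrt(z)) ** 2

def weight(s: float) -> float:
    """Phi^R(u, kappa(u)) du expressed in the variable s, where u = 1 + s^2."""
    return 4.0 * s * (1.0 - s) ** 2 / (1.0 + s * s) ** 3

def psi_up(x: float) -> float:
    """psi on [1, 2]: integral of (x - u) Phi^R(u, kappa(u)) over u in [1, x]."""
    return simpson(lambda s: (x - 1.0 - s * s) * weight(s), 0.0, math.sqrt(x - 1.0))

def dpsi_up(x: float) -> float:
    """psi' on [1, 2]."""
    return simpson(weight, 0.0, math.sqrt(x - 1.0))

def psi(y: float) -> float:
    if y >= 1.0:
        return psi_up(y)
    k = k_inv(y)                       # implicit definition below f(0) = 1
    return psi_up(k) + dpsi_up(k) * (y - k) + H_R(k, y)

def dpsi(y: float) -> float:
    if y >= 1.0:
        return dpsi_up(y)
    k = k_inv(y)
    return dpsi_up(k) + H_R_y(k, y)

grid = [round(0.05 * i, 2) for i in range(1, 40)]     # 0.05, 0.10, ..., 1.95
table = {x: (psi(x), dpsi(x)) for x in grid}

def slack(x: float, y: float) -> float:
    """L(x, y); non-negative when psi is the root of a sub-hedge."""
    px, dx = table[x]
    return px + dx * (y - x) + H_R(x, y) - table[y][0]

min_slack = min(slack(x, y) for x in grid for y in grid)
print(f"smallest L(x, y) on the grid: {min_slack:.3e}")
```

The minimum is attained on the diagonal $y = x$, where $L$ vanishes identically. The inequality is also tight along $y = \kappa(x)$, which is the jump target of the extremal model.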

6 Continuous limits and the tightness of the bound 

The bounds we have constructed based on the functions $\psi_\kappa$ hold simultaneously across all paths and all partitions. The purpose of this section is to consider the limit as the partition becomes finer. It will turn out that in the continuous limit there is a stochastic model which is consistent with the observed call prices and for which there is equality in the inequality (4.1) from which we derive the lower bound. In this sense the model-free bound is optimal, and can be attained.

The analysis of this section justifies restricting attention to candidate payoffs of Classes $\mathcal{K}$ and $\mathcal{L}$. Hedges of this type either sub-replicate or super-replicate the payoff of the variance swap depending on the form of the kernel, but there could be other sub- and super-replicating strategies which do not take this form. In principle, for a given partition one of these other sub-hedges could give a tighter model-independent bound than we can derive from our analysis. (As an extreme example, suppose the partition is trivial ($0 = t_0 < t_1 = T$). Then $V_H(f, P) = H(f(0), f(T))$ which can be replicated exactly using call options.) However, in the continuous limit our bound is best possible, so that when the partition is finite, but the mesh size is small, we expect our hedge to be close to best possible and relatively simple to implement.

Figure 2: For the two kernels $\psi_\kappa$ is shown as a dashed line and $\psi_\ell$ is shown as a solid line. For the kernel $H^R$ (left-hand side), $\psi_\kappa$ is associated with a lower bound on the price of the variance swap. For the kernel $H^L$ (right-hand side) $\psi_\kappa$ is associated with an upper bound.

For a finite partition $P^{(n)}$ in the dense sequence $\mathcal{P} = (P^{(n)})_{n\ge1}$ we have

$$V_H(f, P^{(n)}) = \sum_{k=0}^{N^{(n)}-1} H(f(t_k), f(t_{k+1})) \ge \psi(f(T)) - \psi(f(0)) - \sum_{k=0}^{N^{(n)}-1} \psi'(f(t_k))(f(t_{k+1}) - f(t_k)). \qquad (6.1)$$

We want to conclude that the limits $V_H(f, P_\infty) = \lim_n V_H(f, P^{(n)})$ and

$$\lim_{n\uparrow\infty} \sum_k \psi'(f(t_k))(f(t_{k+1}) - f(t_k)) = \int_0^T \psi'(f(t-))\,df(t) \qquad (6.2)$$

exist for each path under consideration. Our analysis follows the development of a path-wise Ito's formula in Follmer [17]. Let $\epsilon_t$ denote a point mass at $t$.

Definition 6.1. A path realisation $f$ has a quadratic variation on a dense sequence of partitions $\mathcal{P} = (P^{(n)})_{n\ge1}$ if, when we define the measure

$$\zeta_n = \sum_{k=0,\ t_k \in P^{(n)}}^{N^{(n)}-1} (f(t_{k+1}) - f(t_k))^2\,\epsilon_{t_k},$$

then the sequence $\zeta_n$ converges weakly to a Radon measure $\zeta$ on $[0, T]$. Then $([f]_t)_{t\ge0}$ is given by $[f]_t = \zeta([0, t])$.

The atomic part of $\zeta$ is given by squared jumps of $f$. Moreover the quadratic variation $([f]_t)_{t\ge0}$ is simply the cumulative mass function of $\zeta$.

Theorem 6.2. (Follmer [17]) Suppose the price realisation $f$ has a quadratic variation along $(P^{(n)})_{n\ge1}$ and $G$ is a twice continuously differentiable function from $\mathbb{R}_+$ to $\mathbb{R}$, then

$$\int_0^T G'(f(t-))\,df(t) = \lim_{n\uparrow\infty} \sum_k G'(f(t_k))(f(t_{k+1}) - f(t_k))$$

exists and

$$G(f(T)) - G(f(0)) = \int_0^T G'(f(s-))\,df(s) + \frac{1}{2}\int_{(0,T]} G''(f(s))\,d[f]^c_s + \sum_{s\le T}\left[G(f(s)) - G(f(s-)) - G'(f(s-))\Delta f(s)\right],$$

and the series of jump terms is absolutely convergent.

Hence, provided $\psi$ is twice continuously differentiable on the support of $f$ and $f$ has a quadratic variation along $\mathcal{P}$, it follows immediately that the limit in (6.2) exists. In our setting $\psi''(u) = \Phi(u, \kappa(u))$ for $u > 1$, so that a sufficient condition for $\psi''(u)$ to be continuous on $(1, \infty)$ is that $\kappa$ is continuous. Further, on $u < 1$, provided $k = \kappa^{-1}$ is differentiable and $H_y$ exists, we have $\psi'(z) = \psi'(k(z)) + H_y(k(z), z)$. Hence, sufficient conditions for $\psi$ to be twice continuously differentiable on $(0, 1)$ are that $k$ is continuously differentiable, $\kappa$ is continuous and $H_{xy}$ and $H_{yy}$ are continuous. Let $\mathcal{K}_c$ be the class of decreasing functions $\kappa : (f(0), \infty) \to (0, f(0))$ which are continuous and have an inverse $k$ which is continuously differentiable.

Corollary 6.3. Suppose that $H$ is an increasing variance kernel, and that $f$ has a quadratic variation. Suppose $\kappa \in \mathcal{K}_c$ and $\psi = \psi_\kappa$. Then the limit in (6.2) exists.

Now we want to consider $V_H(f, P_\infty) = \lim_n V_H(f, P^{(n)})$.

Lemma 6.4. Suppose $H$ is a variance swap kernel. If $\mathcal{P} = (P^{(n)})_{n\ge1}$ is a dense sequence of partitions, and $f$ has a quadratic variation along $\mathcal{P}$, then $\lim_{n\uparrow\infty} V_H(f, P^{(n)})$ exists and satisfies

$$V_H(f, P_\infty) = \int_{(0,T]} \frac{1}{f(t-)^2}\,d[f]_t + \sum_{0<t\le T} H(f(t-), f(t)) - \sum_{0<t\le T} \frac{1}{f(t-)^2}(\Delta f(t))^2. \qquad (6.3)$$

Proof. Our proof follows Follmer [17]. Fix $\epsilon > 0$. Partition $[0, T]$ into two classes: a finite class $C_1 = C_1(\epsilon)$ of jump times and a class $C_2 = C_2(\epsilon)$ such that

$$\sum_{s\in[0,T],\ s\in C_2(\epsilon)} (\Delta f(s))^2 \le \epsilon^2. \qquad (6.4)$$

Then $\sum_{k=0}^{N^{(n)}-1} H(f(t_k), f(t_{k+1})) = \sum_1 H(f(t_k), f(t_{k+1})) + \sum_2 H(f(t_k), f(t_{k+1}))$, where $\sum_1$ indicates a sum over those $0 \le k \le N^{(n)} - 1$ for which $(t_k, t_{k+1}]$ contains a jump of class $C_1$. It follows that

$$\lim_n \sum\nolimits_1 H(f(t_k), f(t_{k+1})) = \sum_{t\in C_1(\epsilon)} H(f(t-), f(t)). \qquad (6.5)$$

On the other hand, using the properties $H(x, x) = 0$, $H_y(x, x) = 0$ we have from Taylor's formula that $H(x, y) = \frac{1}{2}H_{yy}(x, x)(y - x)^2 + r(x, y)$. Using the fact that $(f(t))_{0\le t\le T}$ is contained in a compact subset of $(0, \infty)$ we may assume that the remainder term satisfies $|r(x, y)| \le R(|y - x|)(y - x)^2$ where $R$ is an increasing function on $[0, \infty)$ such that $R(c) \to 0$ as $c \to 0$. Then

$$\sum\nolimits_2 H(f(t_k), f(t_{k+1})) = \frac{1}{2}\sum\nolimits_2 H_{yy}(f(t_k), f(t_k))(f(t_{k+1}) - f(t_k))^2 + \sum\nolimits_2 r(f(t_k), f(t_{k+1}))$$

$$= \frac{1}{2}\sum_k H_{yy}(f(t_k), f(t_k))(f(t_{k+1}) - f(t_k))^2 - \frac{1}{2}\sum\nolimits_1 H_{yy}(f(t_k), f(t_k))(f(t_{k+1}) - f(t_k))^2 + \sum\nolimits_2 r(f(t_k), f(t_{k+1})). \qquad (6.6)$$

Since $H_{yy}(f, f) = 2/f^2$ is uniformly continuous over the bounded set of values $(f(t))_{0\le t\le T}$, by (9) in Follmer [17], the first term in (6.6) converges to $\int_{(0,T]} \frac{1}{f(t-)^2}\,d[f]_t$ and the second term converges to $-\sum_{s\in C_1} \frac{1}{f(s-)^2}(\Delta f(s))^2$. Using (6.4) and the fact that the remainder term satisfies $|r(x, y)| \le R(|y - x|)(y - x)^2$ we have that the last term is bounded by $R(\epsilon)[f]_T$. Finally, letting $\epsilon \downarrow 0$ we conclude that $V_H(f, P_\infty) = \lim_n V_H(f, P^{(n)})$ exists and (6.3) follows. □

Corollary 6.5. $V_{H^R}(f, P_\infty) = \int_{(0,T]} f(t-)^{-2}\,d[f]_t$ and $V_{H^L}(f, P_\infty) = [\log f]_T$.
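The limits in Corollary 6.5 can be observed numerically on a path whose continuous part has zero quadratic variation. The sketch below uses a hypothetical path on $[0,1]$ that drifts linearly and has one downward jump of 20% at $t = 0.5$, with the kernels assumed to be $H^R(x,y) = ((y-x)/x)^2$ and $H^L(x,y) = (\log(y/x))^2$. For such a path the limits reduce to the jump contributions $(\Delta f/f(t-))^2 = 0.04$ and $(\log 0.8)^2$ respectively.

```python
import math

def H_R(x: float, y: float) -> float:
    return ((y - x) / x) ** 2

def H_L(x: float, y: float) -> float:
    return math.log(y / x) ** 2

def f(t: float) -> float:
    """Hypothetical right-continuous path: linear drift, one 20% downward jump at t = 0.5."""
    level = 1.0 + 0.2 * t
    return level if t < 0.5 else 0.8 * level

def realised(H, n: int) -> float:
    """Payoff of the swap with kernel H on the uniform partition of [0, 1] with n steps."""
    times = [k / n for k in range(n + 1)]
    return sum(H(f(a), f(b)) for a, b in zip(times, times[1:]))

limit_R = 0.2 ** 2                 # (Delta f / f(t-))^2 at the jump
limit_L = math.log(0.8) ** 2       # (Delta log f)^2 at the jump

for n in (10, 100, 1000, 10000):
    print(f"n={n:6d}  V_HR={realised(H_R, n):.6f}  V_HL={realised(H_L, n):.6f}")
print(f"limits    V_HR={limit_R:.6f}  V_HL={limit_L:.6f}")
```

The two limits differ, which illustrates the kernel dependence of the variance swap payoff once the path is allowed to jump.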

Combining (6.1) with Theorem 6.2 and Lemma 6.4 it follows that for a path of finite quadratic variation and $\psi$ a twice continuously differentiable function with $\psi(f(0)) = 0$,

$$V_H(f, P_\infty) \ge \psi(f(T)) - \int_0^T \psi'(f(t-))\,df(t). \qquad (6.7)$$

The left hand side is the payoff of the variance swap in the continuous limit. The expression on the right can be interpreted as the payoff of a semi-static hedging strategy $(\psi, -\psi')$ under continuous trading. From Definition 2.10 for each of the partitions in the sequence we have that the price of the semi-static hedge is

$$\int_0^\infty \psi(x)\,\mu(dx) = \int_{f(0)}^\infty \psi''(x)\,C_\mu(x)\,dx + \int_0^{f(0)} \psi''(z)\left(C_\mu(z) + z - f(0)\right)dz. \qquad (6.8)$$

Since this value does not depend on the partition, in the continuous-time setting we define the price of the sub-hedge $(\psi, -\psi')$ to also be the expression given in (6.8).

Corollary 6.6. Suppose $H$ is an increasing variance swap kernel. A model-independent lower bound on the price of the continuous time limit of the variance swap with payoff $V_H(f)$ is

$$\sup_\kappa \int_0^\infty \psi_\kappa(x)\,\mu(dx) = \int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx) \qquad (6.9)$$

where $\alpha_\mu$ is the quantity which arises in the Perkins embedding (Theorem 3.2).

Proof. For any decreasing function $\kappa \in \mathcal{K}_c$ we can construct $\psi_\kappa$ such that $\int_0^\infty \psi_\kappa(x)\,\mu(dx)$ is the price of a sub-hedge for $V_H$ for any partition, and this continues to hold in the continuous-time limit. Moreover, by optimising over $\kappa$ we obtain a bound $\int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx)$ which is the best bound of this form by Proposition 5.2. Note that even if $\alpha_\mu$ is not in class $\mathcal{K}_c$, by Corollary 5.3 we can approximate it from above by a sequence of elements of class $\mathcal{K}_c$ such that in the limit we obtain the price $\int_0^\infty \psi_{\alpha_\mu}(x)\,\mu(dx)$ as a bound. □

Our goal now is to show that this is a best bound in general and not just an optimal bound
based on inequalities such as (6.1) for $\psi = \psi_\kappa$ and $\kappa$ a decreasing function. We do this by
showing that there is a consistent model for which the price of the continuously monitored
variance swap is equal to $\int_0^\infty \psi_{\alpha_\mu}(x) \, \mu(dx)$.

Theorem 6.7. There exists a consistent model such that

$$ V_H((X_t)_{0 \leq t \leq T}, P_\infty) = \psi_{\alpha_\mu}(X_T) - \int_0^T \psi'_{\alpha_\mu}(X_{s-}) \, dX_s. \tag{6.10} $$

Proof. Recall Definition 2.12 and note that we are given a set of call prices and that in
constructing a consistent model we are free to design an appropriate probability space
$(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{0 \leq t \leq T}, \mathbb{P})$ as well as a stochastic process $(X_t)_{t \geq 0}$.

Suppose we are given call prices $C(x) = C_\mu(x)$ for some $\mu$. Let
$(\Omega, \mathcal{G}, \mathbb{G} = (\mathcal{G}_u)_{u \geq 0}, \mathbb{P})$ support a Brownian motion $(W_u)_{u \geq 0}$ with initial
value $W_0 = f(0) = \int_{\mathbb{R}_+} x \, \mu(dx)$ and suppose $\mathcal{G}_0$ contains a $U[0, 1]$ random variable
which is independent of $W$. (This last condition is necessary purely to ensure that the Perkins
embedding of $\mu$ can be defined when $\mu$ has an atom at $f(0)$. If $\mu$ has no atom at $f(0)$ then we may
take $\mathcal{G}_0$ to be trivial.)

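Let $\tau_P$ be the Perkins embedding of $\mu$ in $W$. Write $S$ for the maximum process of $W$
so that $S_u = \max_{v \leq u} W_v$. Write $H_x$ for the first hitting time by $W$ of $x$. Let
$(\Lambda(t))_{0 \leq t \leq T}$ be a strictly increasing continuous function with $\Lambda(0) = f(0)$ and
$\lim_{t \uparrow T} \Lambda(t) = \infty$. Now define the left-continuous process $X = (X_t)_{0 \leq t \leq T}$ via

$$ X_t = \begin{cases} \Lambda(t) & H_{\Lambda(t)} \leq \tau_P, \\ W_{\tau_P} & \tau_P < H_{\Lambda(t)}. \end{cases} $$

Note that the condition $H_{\Lambda(t)} \leq \tau_P$ can be re-written as $\Lambda(t) \leq S_{\tau_P}$ or equivalently
$t \leq \Lambda^{-1}(S_{\tau_P})$. Define also $\mathcal{F}_t = \mathcal{G}_{H_{\Lambda(t)} \wedge \tau_P}$. Then $X$ is adapted to the
filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \leq t \leq T}$ and $X$ is a $\mathbb{F}$-martingale for which $X_T = W_{\tau_P} \sim \mu$.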

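In order to construct a right-continuous martingale with the same properties, for $t < T$ we
set $\tilde{\mathcal{F}}_t = \bigcap_{u > t} \mathcal{F}_u$ and $\tilde{X}_t = \lim_{u \downarrow t} X_u$, and for $t = T$ we set
$\tilde{\mathcal{F}}_T = \mathcal{F}_T$, $\tilde{X}_T = X_T$. Then $\tilde{X}$ is a right-continuous $\tilde{\mathbb{F}}$-martingale such that
$(\Omega, \tilde{\mathcal{F}}, \tilde{\mathbb{F}} = (\tilde{\mathcal{F}}_t)_{0 \leq t \leq T}, \mathbb{P})$ is a consistent model.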

Now we want to show that for this model (6.10) holds path-wise. Writing $\psi$ for $\psi_{\alpha_\mu}$, and
$X_t$ as shorthand for each $\tilde{X}_t(\omega)$ we have for each $\omega$

$$ \psi(X_T) - \int_0^T \psi'(X_{t-}) \, dX_t = \psi(W_{\tau_P}) - \int_{t=0}^{\Lambda^{-1}(S_{\tau_P})} \psi'(\Lambda(t)) \, d\Lambda(t) - \psi'(S_{\tau_P})(W_{\tau_P} - S_{\tau_P}) $$
$$ = \psi(W_{\tau_P}) - \int_{f(0)}^{S_{\tau_P}} \psi'(u) \, du - \psi'(S_{\tau_P})(W_{\tau_P} - S_{\tau_P}) $$
$$ = \psi(W_{\tau_P}) - \psi(S_{\tau_P}) - \psi'(S_{\tau_P})(W_{\tau_P} - S_{\tau_P}). $$

There are two cases. Either $W_{\tau_P} = S_{\tau_P}$, in which case this expression is equal to zero, or
$W_{\tau_P} = \alpha_\mu(S_{\tau_P})$ and then the expression becomes

$$ \psi(\alpha_\mu(s)) - \psi(s) - \psi'(s)(\alpha_\mu(s) - s) = H(s, \alpha_\mu(s)) $$

at $s = S_{\tau_P}$, using Definition 4.2. In either case the right hand side of (6.10) is $H(S_{\tau_P}, W_{\tau_P})$.
For the left hand side of (6.10), $[X]^c_T = 0$ and
$(\Delta X_u)^2 = (S_{\tau_P} - W_{\tau_P})^2 \, 1_{\{u = \Lambda^{-1}(S_{\tau_P})\}} 1_{\{W_{\tau_P} \neq S_{\tau_P}\}}$,
so that from (6.3), $V_H(f, P_\infty) = H(S_{\tau_P}, W_{\tau_P})$. Hence (6.10) holds path-wise.

□ 

Corollary 6.8. Suppose H is an increasing variance swap kernel. Then the highest model 
independent lower bound on the price of a variance swap is given by the expression in (6.9). 

Corollary 6.9. If $\Phi(u, y)$ does not depend on $y$ then the corresponding variance swap is perfectly
replicable by $(\psi, -\psi')$. For all consistent models the variation swap has price $\int \psi(x) \, \mu(dx)$.

Example 6.10. Recall the definitions of the kernels $H^B$ and $H^Q$ and Example 4.6. $\Phi^B(u, y) = 2u^{-2}$
and so $\psi'(u) = -2/u$ and $\psi(u) = -2\log(u)$. Thus $H^B(x, y) = \psi(y) - \psi(x) - \psi'(x)(y - x)$
and the strategy $(\psi, -\psi')$ replicates the payoff perfectly for any price realisation. The observation
that $H^B$ has one model-independent price was first made by Bondarenko in [3]. Similarly,
$H^Q(x, y) = \psi(y) - \psi(x) - \psi'(x)(y - x)$, where $\psi(x) = x^2$. An alternative analysis of these
two payoffs is due to Neuberger [26]. Neuberger introduces the aggregation property. Translated
into the notation of our setting, a kernel enjoys the aggregation property if
$\mathbb{E}[V_H(X, P)] = \mathbb{E}[H(X_T - X_0)]$. Both Bondarenko [3] and Neuberger [26] advocate the use of
$H^B$ due to the fact that its price is not sensitive to the price path, but only to the value of $X_T$.
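The perfect replication of $H^B$ can be seen numerically by summing the kernel along any discrete price path and comparing with the payoff of $(\psi, -\psi')$ for $\psi(u) = -2\log(u)$. The sketch below is only an illustration: the simulated path, its jump sizes and the random seed are assumptions, and any positive sequence of prices would do.

```python
import math
import random

def h_b(x, y):
    """Bondarenko kernel H^B(x, y) = -2(log y - log x - (y/x - 1))."""
    return -2.0 * (math.log(y) - math.log(x) - (y / x - 1.0))

def psi(u):
    return -2.0 * math.log(u)

def psi_prime(u):
    return -2.0 / u

def swap_payoff(prices):
    """Discretely monitored variance swap payoff with kernel H^B."""
    return sum(h_b(x, y) for x, y in zip(prices, prices[1:]))

def hedge_payoff(prices):
    """Static payoff psi(f_T) - psi(f_0) plus gains from holding -psi'(f_k) units."""
    static = psi(prices[-1]) - psi(prices[0])
    dynamic = sum(-psi_prime(x) * (y - x) for x, y in zip(prices, prices[1:]))
    return static + dynamic

# A made-up path with small diffusive moves and occasional large jumps.
random.seed(0)
prices = [1.0]
for _ in range(50):
    shock = random.gauss(0.0, 0.02)
    if random.random() < 0.1:
        shock += random.choice([-0.3, 0.25])
    prices.append(prices[-1] * math.exp(shock))

print(swap_payoff(prices), hedge_payoff(prices))
```

The agreement is exact up to rounding for every path, with or without jumps, because each term of the sum is itself of the form $\psi(y) - \psi(x) - \psi'(x)(y - x)$.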

7 Non-zero interest rates 

To date we have worked with forward prices. This has the implication that the dynamic part 
of a hedging strategy has zero cost. In this section we outline how our analysis can be extended 
to non-zero, but deterministic, interest rates. 

Suppose that interest rates are deterministic. Let $D_t = D_t(T)$ be the discount factor over
$[t, T]$ so that the asset price realisation $s = (s_t)_{0 \leq t \leq T}$ and the forward price realisation are
related by $s(t) = D_t f(t)$. In the case of constant interest rates $D_t(T) = e^{-r(T - t)}$ so that
$s(t) = e^{-r(T - t)} f(t)$.

Let $P$ be a partition of $[0, T]$. For $k \in \{0, 1, \ldots, N - 1\}$ write $s_k = s(t_k)$, $f_k = f(t_k)$ and
$D_k = D_{t_k}(T)$. Set $D_{k,k+1} = D_{k+1}/D_k$. Note that if interest rates are non-negative then
$D_{k,k+1} \geq 1$.

Let $G$ be the kernel of a variation swap and write $G_k(x, y) = G(D_k x, D_k y)$. Then the payoff
of the variance swap is given by

$$ V_G(s, P) = \sum_{k=0}^{N-1} G(D_k f_k, D_{k+1} f_{k+1}) = \sum_{k=0}^{N-1} G_k(f_k, D_{k,k+1} f_{k+1}). $$
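The two expressions for the payoff can be compared on a small example. The sketch below assumes a constant interest rate, a non-uniform partition and made-up forward prices, and uses the squared-return kernel only as an example of $G$; none of these choices comes from the text.

```python
import math

def discount(t, T, r):
    """D_t(T) = exp(-r (T - t)) for a constant rate r."""
    return math.exp(-r * (T - t))

def kernel(x, y):
    """Squared-return kernel, used here only as an example of G."""
    return ((y - x) / x) ** 2

T, r = 1.0, 0.05
times = [0.0, 0.2, 0.5, 0.75, 1.0]             # a non-uniform partition
forwards = [100.0, 104.0, 97.0, 99.5, 103.0]   # made-up forward prices f_k

D = [discount(t, T, r) for t in times]
spot = [d * f for d, f in zip(D, forwards)]    # s_k = D_k f_k
steps = range(len(times) - 1)

# Direct evaluation on the asset price realisation.
v_direct = sum(kernel(spot[k], spot[k + 1]) for k in steps)

def g_k(k, x, y):
    """G_k(x, y) = G(D_k x, D_k y)."""
    return kernel(D[k] * x, D[k] * y)

# Evaluation through G_k and the ratios D_{k,k+1} = D_{k+1} / D_k.
ratios = [D[k + 1] / D[k] for k in steps]
v_forward = sum(g_k(k, forwards[k], ratios[k] * forwards[k + 1]) for k in steps)

print(v_direct, v_forward, ratios)
```

With a non-negative rate every ratio $D_{k,k+1}$ printed here is at least one, as noted above.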

Proposition 7.1. Suppose that there exists a variation swap kernel $H$, functions $\eta$, $\epsilon$, $B$ and
a constant $A \in \mathbb{R}$ such that for all $D > 0$

$$ G_k(x, yD) \geq A H(x, y) + \eta(y) - \eta(x) + \epsilon(x, k, D)(y - x) + B(k, D). \tag{7.1} $$

Without loss of generality we may take $\eta(f(0)) = 0$.

Suppose that there exists a semi-static sub-hedging strategy $(\psi, \Delta)$ for the variation swap
with kernel $H$. Then

$$ V_G(s, P) \geq (A\psi + \eta)(f(T)) + \sum_k \bigl[ \epsilon(f_k, k, D_{k,k+1}) + A \Delta_{t_k}((f(t))_{t \leq t_k}) \bigr] (f_{k+1} - f_k) + \sum_k B(k, D_{k,k+1}), $$

and there is a model-independent sub-hedge and price lower bound for $V_G$.
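Proof. We have

$$ V_G(s, P) = \sum_{k=0}^{N-1} G_k(f_k, D_{k,k+1} f_{k+1}) $$
$$ \geq A \Bigl[ \psi(f(T)) + \sum_k \Delta_{t_k}((f(t))_{t \leq t_k})(f_{k+1} - f_k) \Bigr] + \eta(f(T)) $$
$$ + \sum_k \epsilon(f_k, k, D_{k,k+1})(f_{k+1} - f_k) + \sum_k B(k, D_{k,k+1}). $$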

□ 

Remark 7.2. If we are content to assume that interest rates are non-negative then we only need
(7.1) to hold for $D \geq 1$.

Remark 7.3. The price for the floating leg associated with the hedge is the price of the static
vanilla portfolio with payoff $(A\psi + \eta)(f(T))$ plus the constant $\sum_{k=0}^{N-1} B(k, D_{k,k+1})$.

Corollary 7.4. Suppose $H$ is an increasing variance kernel, and $\psi$ is of Class $\mathcal{K}$. If (7.1) holds
then we have a path-wise sub-hedge and a model independent bound on the price of $V_G$.
In the setting of increasing or decreasing variance kernels the bound in (7.2) will be tight
provided $(\psi, -\psi')$ is a tight semi-static hedge for $V_H(f, P)$ and there is equality in Equation
(7.1).



Example 7.5. Suppose $G(x, y) = H^R(x, y) = \frac{(y - x)^2}{x^2}$. Then $G_k(x, y) = G(x, y)$, so that
$\epsilon(x, k, D)$ and $B(k, D)$ will not depend on $k$. Moreover,

$$ G(x, yD) = \frac{1}{x^2}(Dy - Dx + Dx - x)^2 = D^2 \left( \frac{y - x}{x} \right)^2 + \frac{2D(D - 1)}{x}(y - x) + (D - 1)^2. $$

Suppose that interest rates are non-negative so that $D_{k,k+1} \geq 1$. Then (7.1) holds for $A = 1$,
$\eta = 0$, $\epsilon(x, D) = 2D(D - 1)/x$ and $B(D) = (D - 1)^2$.

Note that there is an inequality in (7.1) for $A = 1$. If $D_{k,k+1}$ is independent of $k$ (the natural
example is to assume that interest rates are constant and the partition is uniform, in which case
$d = \log D_{k,k+1} = rT/N$) then we can have equality by taking $A = e^{2rT/N}$. In that case we have
an improved bound, but the improvement becomes negligible in the limit $N \uparrow \infty$.
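The expansion in Example 7.5 can be checked numerically. The sketch below takes the coefficient of $(y - x)$ that comes out of expanding $(Dy - x)^2 / x^2$, namely $2D(D - 1)/x$, and evaluates the gap between $G(x, yD)$ and the right hand side of (7.1) for $A = 1$ and for $A = D^2$. The rate, horizon, number of monitoring dates and sample points are arbitrary illustrative assumptions.

```python
import math

def h_r(x, y):
    """Squared-return kernel H^R."""
    return ((y - x) / x) ** 2

def decomposition(x, y, D, A):
    """Right hand side of (7.1) with eta = 0, eps = 2D(D-1)/x, B = (D-1)^2."""
    eps = 2.0 * D * (D - 1.0) / x
    return A * h_r(x, y) + eps * (y - x) + (D - 1.0) ** 2

r, T, N = 0.05, 1.0, 12
D = math.exp(r * T / N)     # D_{k,k+1} for constant rates and a uniform partition
samples = [(1.0, 0.7), (1.0, 0.999), (1.0, 1.3), (2.5, 2.4), (80.0, 95.0)]

# Gap G(x, yD) - RHS: non-negative for A = 1 (since D >= 1), zero for A = D^2.
gaps_a1 = [h_r(x, y * D) - decomposition(x, y, D, 1.0) for x, y in samples]
gaps_ad2 = [h_r(x, y * D) - decomposition(x, y, D, D * D) for x, y in samples]

print(gaps_a1)
print(gaps_ad2)
```

For $A = 1$ the gap equals $(D^2 - 1) H^R(x, y)$, which is why it shrinks as the partition is refined.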

Example 7.6. Suppose $G(x, y) = H^L(x, y) = (\log(y) - \log(x))^2$. Then $G_k(x, y) = G(x, y)$ and

$$ G(x, yD) = (\log D + \log y - \log x)^2 = H^L(x, y) + 2 \log D \, (\log y - \log x) + (\log D)^2. $$

Suppose now that the partition is such that $D_{k,k+1}$ is independent of $k$, and set $d = \log D_{k,k+1}$.
Then Equation (7.1) holds with equality for $A = 1$, $\eta(y) = 2d \log y$, $\epsilon = 0$ and $B(D) = d^2$.

Example 7.7. Suppose $G(x, y) = H^B(x, y) = -2(\log y - \log x - (y/x - 1))$. Then $G_k(x, y) = G(x, y)$ and

$$ G(x, yD) = -2(\log y - \log x + \log D) + 2D(y - x)/x + 2(D - 1) $$
$$ = H^B(x, y) + 2(D - 1)(y/x - 1) + H^B(1, D). $$

Then Equation (7.1) holds with equality for $A = 1$, $\eta(y) = 0$, $\epsilon(x, D) = 2(D - 1)/x$,
$B(D) = H^B(1, D)$.
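Both identities are easy to confirm numerically. The following sketch evaluates the two sides of (7.1) for the kernels $H^L$ and $H^B$ with the choices of $A$, $\eta$, $\epsilon$ and $B$ listed in Examples 7.6 and 7.7; the value of $D$ and the sample points are arbitrary assumptions.

```python
import math

def h_l(x, y):
    """Squared-log-return kernel H^L."""
    return (math.log(y) - math.log(x)) ** 2

def h_b(x, y):
    """Bondarenko kernel H^B."""
    return -2.0 * (math.log(y) - math.log(x) - (y / x - 1.0))

def rhs_log(x, y, D):
    """(7.1) for H^L: A = 1, eta(y) = 2 d log y, eps = 0, B = d^2, d = log D."""
    d = math.log(D)
    return h_l(x, y) + 2.0 * d * (math.log(y) - math.log(x)) + d * d

def rhs_bondarenko(x, y, D):
    """(7.1) for H^B: A = 1, eta = 0, eps = 2(D-1)/x, B = H^B(1, D)."""
    return h_b(x, y) + (2.0 * (D - 1.0) / x) * (y - x) + h_b(1.0, D)

D = 1.004
samples = [(1.0, 0.6), (1.0, 1.0), (1.0, 1.5), (3.0, 2.9), (50.0, 64.0)]

err_log = max(abs(h_l(x, y * D) - rhs_log(x, y, D)) for x, y in samples)
err_bondarenko = max(abs(h_b(x, y * D) - rhs_bondarenko(x, y, D)) for x, y in samples)

print(err_log, err_bondarenko)
```

Both errors are at the level of floating-point rounding, consistent with equality in (7.1) for these two kernels.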

We can consider the limit as the partition becomes dense, in which case the bounds for the
variance swap become tight. For definiteness we will assume that we have a sequence of uniform
partitions with mesh size tending to zero, and that interest rates are constant, though this can
be weakened for the squared return and Bondarenko kernels.

Then, for each of the three examples above we have that
$\sum_{k=0}^{N-1} B(k, D_{k,k+1}) = N B(e^{rT/N}) \to 0$.
Further, in each case $\eta(y) \to 0$, and $A = 1$. Then in the limit the lower bound on the price
of the variance swap based on the price realisation $s$ is the same as the upper and lower bounds
for the variance swap defined relative to the forward price $f$. Thus, for variance swaps based
on frequent monitoring, the bounds we have calculated in earlier sections based on the forward
price may also be used for undiscounted price processes.
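The rate at which the constant term vanishes can be tabulated directly. The sketch below computes $N B(e^{rT/N})$ for the three choices of $B$ from Examples 7.5 to 7.7; the interest rate, horizon and values of $N$ are illustrative assumptions.

```python
import math

def b_r(D):
    """B(D) for the squared-return kernel H^R."""
    return (D - 1.0) ** 2

def b_l(D):
    """B(D) for the squared-log-return kernel H^L."""
    return math.log(D) ** 2

def b_b(D):
    """B(D) = H^B(1, D) for the Bondarenko kernel."""
    return -2.0 * math.log(D) + 2.0 * (D - 1.0)

r, T = 0.05, 1.0

def total_constant(b, n):
    """N * B(exp(rT/N)) for a uniform partition with n monitoring intervals."""
    return n * b(math.exp(r * T / n))

grid = (1, 12, 252, 10000)
table = {n: tuple(total_constant(b, n) for b in (b_r, b_l, b_b)) for n in grid}

for n in grid:
    print(n, table[n])
```

In each column the total behaves like $(rT)^2 / N$, so for daily monitoring it is already several orders of magnitude smaller than typical variance levels.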

7.1 Super-hedges and upper bounds 

Corollary 7.8. Suppose there exists $H$, $\eta$, $\epsilon$, $B$, and $A$ such that

$$ G_k(x, yD) \leq A H(x, y) + \eta(y) - \eta(x) + \epsilon(x, k, D)(y - x) + B(k, D), \tag{7.2} $$

and suppose that there exists a semi-static super-hedging strategy $(\psi, \Delta)$ for the variation swap
with kernel $H$. Then there is a corresponding model-independent super-hedge and price upper
bound for $V_G$.

The analysis of the kernels $H^R$, $H^L$, $H^B$ and upper bounds is similar to that in
Examples 7.5-7.7 above. For the kernel $H^B$, the choices listed in Example 7.7 give equality in
(7.2) and can be used equally for upper bounds. Provided that we have an upper bound for
$D_{k,k+1}$, so that $D_{k,k+1} \leq \bar{D}$ uniformly in $k$, for the kernel $H^R$ we may take $A = \bar{D}^2$, $\eta = 0$,
$\epsilon(x, D) = 2D(D - 1)/x$ and $B(D) = (D - 1)^2$. Finally, for $H^L$, provided interest rates are
non-negative, we can write

$$ G(x, yD) = H^L(x, y) + 2 \log D \, (\log y - \log x) + (\log D)^2 \leq H^L(x, y) + \frac{2 \log D}{x}(y - x) + (\log D)^2 $$

so that (7.2) holds for $A = 1$, $\eta = 0$, $\epsilon(x, D) = 2(\log D)/x$ and $B(D) = (\log D)^2$. Note that,
unlike for the lower bound in Example 7.6, for the upper bound we do not need to assume that
$D_{k,k+1}$ is independent of $k$.
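A quick numerical check of the two upper-bound inequalities is sketched below. It uses the linear coefficient $2D(D - 1)/x$ obtained by expanding $(Dy - x)^2/x^2$ for $H^R$, and the elementary inequality $\log y - \log x \leq (y - x)/x$ for $H^L$. The bound on the discount ratios, the ratios themselves (deliberately not constant) and the sample points are assumptions for illustration.

```python
import math

def h_r(x, y):
    """Squared-return kernel H^R."""
    return ((y - x) / x) ** 2

def h_l(x, y):
    """Squared-log-return kernel H^L."""
    return (math.log(y) - math.log(x)) ** 2

def upper_r(x, y, D, D_bar):
    """Right hand side of (7.2) for H^R with A = D_bar^2 and eta = 0."""
    return D_bar ** 2 * h_r(x, y) + (2.0 * D * (D - 1.0) / x) * (y - x) + (D - 1.0) ** 2

def upper_l(x, y, D):
    """Right hand side of (7.2) for H^L with A = 1, eps = 2 log(D)/x, B = (log D)^2."""
    return h_l(x, y) + (2.0 * math.log(D) / x) * (y - x) + math.log(D) ** 2

D_bar = 1.01
ratios = [1.0, 1.002, 1.01]    # hypothetical values of D_{k,k+1}, varying with k
samples = [(1.0, 0.5), (1.0, 0.98), (1.0, 1.0), (1.0, 1.6), (40.0, 31.0)]

slack_r = min(upper_r(x, y, D, D_bar) - h_r(x, y * D) for D in ratios for x, y in samples)
slack_l = min(upper_l(x, y, D) - h_l(x, y * D) for D in ratios for x, y in samples)

print(slack_r, slack_l)
```

The smallest slack is zero up to rounding: it is attained when $D_{k,k+1}$ equals the bound for $H^R$, and when $y = x$ or $D = 1$ for $H^L$.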

Remark 7.9. In his analysis of lower bounds for the kernel $H^L$, Kahale [23] does not need to
assume the partition is uniform and that interest rates are constant (or more generally that
$D_{k,k+1}$ is constant), and can allow for arbitrary finite partitions and deterministic interest
rates. Our results complement his results nicely. Although we need the assumption that $D_{k,k+1}$
is constant to recover Kahale's result in the setting of lower bounds and the kernel $H^L$, in all
other cases of study (upper bounds for $V_{H^L}$ and upper and lower bounds for $V_{H^R}$ and $V_{H^B}$) our
methods also allow for arbitrary partitions and non-constant but deterministic interest rates.

8 Numerical Results



Given a continuum of call prices, it is possible to calculate the model independent bounds 
for the prices of variance swaps. When the implied terminal distribution of the asset price is 
simple it is sometimes possible to calculate the monotone functions associated with the Perkins 
embedding explicitly (see Example 5.4) and to obtain a closed form integral expression for the 
model independent upper and lower bounds. For more realistic and complex target laws, the 
monotone functions and bounds can still be calculated numerically. The case when the terminal 
law is lognormally distributed is of particular practical interest. 

A standard time frame for a volatility swap is 30 days or one month ($T = 1/12$), which
is the time frame used for the widely quoted VIX index. Figure 3 plots the upper and lower
bounds for the prices of variance swaps based on the kernels $H^R$ and $H^L$ relative to the cost of
$-2\log$ contracts (the Neuberger/Dupire price of the standard hedge or 'VIX price') against the
volatility parameter of the lognormal (terminal) distribution centered at 1. More precisely, the
bounds are plots of

$$ \sigma \mapsto \frac{\mathbb{E}[\psi_{\kappa,H}(X_{\sigma\sqrt{T}})]}{\mathbb{E}[-2\log X_{\sigma\sqrt{T}}]} \quad \text{and} \quad \sigma \mapsto \frac{\mathbb{E}[\psi_{\ell,H}(X_{\sigma\sqrt{T}})]}{\mathbb{E}[-2\log X_{\sigma\sqrt{T}}]}, $$

where $X_\sigma = e^{\sigma N - \sigma^2/2}$ is the lognormal random variable with volatility parameter $\sigma$ and
$H = H^R$ or $H^L$. Here, $\psi_{\kappa,H}$ is the function given in Definition 4.2 and $\kappa$ is chosen according to
Proposition 5.2 (with $\ell$ chosen similarly). Thus the upper bound for the kernel $H^L$ and the
lower bound for the kernel $H^R$ correspond to the decreasing function $\kappa$ associated with the
Perkins embedding, while the other two bounds are constructed with the increasing function $\ell$
associated with the reversed Perkins embedding.
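The normalising quantity in these ratios, the cost of $-2\log$ contracts, has the closed form $\sigma^2 T$ for the lognormal law. The sketch below verifies this by quadrature, together with the fact that the law has mean 1. It assumes $N$ is a standard normal random variable; the quadrature rule, its width and the grid of volatilities are illustrative choices, and the bounds themselves are not computed here since they require the Perkins functions.

```python
import math

def normal_expectation(g, n=4000, width=10.0):
    """Midpoint-rule approximation of E[g(Z)] for a standard normal Z."""
    h = 2.0 * width / n
    total = 0.0
    for i in range(n):
        z = -width + (i + 0.5) * h
        total += g(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)

def log_contract_price(sigma, T):
    """E[-2 log X_a] with X_a = exp(a Z - a^2/2) and a = sigma * sqrt(T)."""
    a = sigma * math.sqrt(T)
    return normal_expectation(lambda z: -2.0 * (a * z - 0.5 * a * a))

def terminal_mean(sigma, T):
    """E[X_a], which should equal 1 for a law centered at 1."""
    a = sigma * math.sqrt(T)
    return normal_expectation(lambda z: math.exp(a * z - 0.5 * a * a))

T = 1.0 / 12.0
vols = (0.1, 0.2, 0.3, 0.4, 0.5)
prices = [log_contract_price(sigma, T) for sigma in vols]

for sigma, price in zip(vols, prices):
    print(sigma, price, sigma * sigma * T, terminal_mean(sigma, T))
```

Since the denominator is exactly $\sigma^2 T$, each plotted ratio can be read as a bound on the variance swap price per unit of Black-Scholes variance.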

Note that the price of a variance swap in the Black-Scholes model (as given by $\mathbb{E}[-2\log X_{\sigma\sqrt{T}}]$)
is an increasing function of volatility. The upper and lower bounds are also increasing functions
of volatility, and, as can be seen in the figure, they also become wider as volatility increases,
when expressed as a ratio against the no-jump case. For reasonable values of volatility, and for
both kernels, the impact of jumps is to affect the price by a factor of less than two, and for the
kernel $H^L$ the bounds are even tighter. The observation that the bounds for the kernel $H^R$ are
wider than those for the kernel $H^L$ is partly explained by considering the leading term in the
expansion of the hedging error (see Section 3.2). We have $J_R(x) \sim 2x^3/3$ whereas $J_L(x) \sim -x^3/3$
so that the magnitude of the leading error term for $H^R$ is twice that of the leading error term
for $H^L$. Note that for the optimal martingales the jumps are not local, so this approximation
becomes less relevant as $\sigma$ increases.

Figure 3: Model independent upper and lower bounds for the prices of variance swaps based on
the kernels $H^L$ (solid lines) and on $H^R$ (dashed lines) relative to the price of $-2\log$ contracts
(dotted line) in the case when the terminal distribution is lognormal with volatility between 0
and 0.5. Here $T = 1/12$ and we work with variance swaps on forward prices.



9 Summary and concluding remarks 

This article developed from an attempt to express the results of Kahale [23] on no-arbitrage 
lower bounds for the prices of variance swaps in the framework of model-independent hedging, 
in which extremal models and prices are associated with extremal solutions of the Skorokhod 
embedding problem. Beginning with Hobson [18], the focus in this literature is on hedging, 
and on finding pathwise inequalities relating the payoff of the exotic, path-dependent derivative 
and the payoff of a static vanilla call portfolio combined with the gains from trade from an 
investment in the underlying security. In the context of variance swaps we find that the lower 
bound is associated with a martingale price process which can be expressed as a time-change 
of the Perkins solution of the Skorokhod embedding problem. This embedding has appeared 
previously in finance in the construction of model-independent bounds for the prices of barrier 
options (Brown et al [6]). 

We approach the problem of finding hedging strategies in a more general setting than Ka- 
hale [23] in that we consider a variety of kernels in the definition of the variance swap. The 
ability to consider general kernels allows us to emphasise the dependence of the payoff on the 
presence and character of the jumps, and to show that the nature of this dependence is strongly 
influenced by the form of the kernel. Bondarenko [3] and Neuberger [26] argue that the finance
industry should consider defining variance swaps using the kernel $H^B$ as then they can be
replicated perfectly, even in the presence of jumps; recall Example 6.10. The counterargument is
that variance swaps provide value precisely because they are not redundant in this way.
Sophisticated investors want to be able to take positions on the likely presence and direction of jumps.
This is possible if the variance swap is defined using the kernel $H^R$ or $H^L$, but not using $H^B$.

Kahale [23] only considers the kernel $H^L$, and lower bounds and sub-replicating strategies.
On the other hand he works directly with the undiscounted asset price, and does not give
special attention to contracts written on the forward price. He introduces the class of V-convex
functions which have the property that each such function gives a lower bound on the price of
the variance swap, and an associated sub-hedge. He then proceeds to show that functions $\psi$ of
Class C (in our notation) are V-convex. In this way he can deduce a lower bound on the price
of a variance swap. Further, for a particular choice of decreasing function he can show that this
lower bound can be attained in the continuous time limit under a well-chosen stochastic model,
hence the bound he attains must be a best bound.

In contrast, initially we consider contracts based on the forward price. This simplifies the 
analysis significantly and reduces the search for candidate sub-hedge payoffs to a search for 
functions satisfying (4.3). The condition (4.3) is considerably simpler than the corresponding 
condition for V-convexity in Kahale [23, Equation (3.1)]. The fact that we have a more trans- 
parent representation of the key property allows us to find candidate super-hedge payoffs quite 
easily and allows us to extend the analysis to general variation swap kernels provided they have 
a monotonicity property. Moreover, we can easily develop upper bounds to complement the 
lower bounds. Only later do we introduce interest rates and variance swaps written on the 
undiscounted asset price, at which point we find simple inequalities which extend our bounds 
to the general case. In the limit of a dense sequence of partitions the same bounds are optimal 
in both the undiscounted and forward price settings. We believe that the two-stage approach 
brings insight, not least because in the forward case there is a direct link to martingales and 
solutions of the Skorokhod embedding problem, and because inequalities such as (7.1) allow
us to quantify the price difference between contracts written on the undiscounted and forward
prices for discrete monitoring.

A further contribution of this article is to provide a derivation of bounds on the prices 
of variance swaps without any recourse to probability. This involves construction of a class 
of hedges parameterised by monotone functions, and the choice of an optimal element in this 
class for a given set of call prices, together with Föllmer's non-probabilistic Itô calculus. Price
trajectories for which the bound is path-wise tight have at most one jump, after which the
trajectory is constant. Probability is only required to show that these trajectories correspond
to a stochastic model for the price process. The relationship between the optimality of the
cheapest hedge, derived in a purely non-probabilistic fashion, and the optimality of the Perkins
embedding provides a pleasing completeness to the story.

References 

[1] J. Azéma and M. Yor. Une solution simple au problème de Skorokhod. In Séminaire de
Probabilités, XIII (Univ. Strasbourg, Strasbourg, 1977/78), volume 721 of Lecture Notes in
Math., pages 90-115. Springer, Berlin, 1979.

[2] A. Bick and W. Willinger. Dynamic spanning without probabilities. Stochastic Processes
and their Applications, 50:349-374, 1994.

[3] O. Bondarenko. Variance trading and market price of variance risk. Working paper, 2007. 

[4] D.T. Breeden and R.H. Litzenberger. Prices of state-contingent claims implicit in option 
prices. J. Business, 51:621-651, 1978. 



[5] M. Broadie and A. Jain. The effect of jumps and discrete sampling on volatility and
variance swaps. Int. J. of Th. and App. Finance, 11(8):761-791, 2008.

[6] H. Brown, D.G. Hobson, and L.C.G. Rogers. Robust hedging of barrier options. Math.
Finance, 11(3):285-314, 2001.

[7] P. Carr and A. Corso. Covariance contracting for commodities. Energy and Power Risk 
Management, April:42-45, 2001. 

[8] P. Carr and R. Lee. Volatility derivatives. Annual Rev. Financ. Econ., 1:313-339, 2009. 

[9] P. Carr and R. Lee. Variation and share-weighted variation swaps on time-changed Lévy
processes. Preprint, 2010.

[10] P. Carr, R. Lee, and L. Wu. Variance swaps on time-changed Lévy processes. Preprint,
2010.

[11] A. Cox and J. Wang. Root's barrier: construction, optimality and applications to variance 
options. Preprint, 2011. 

[12] M.H.A. Davis and D.G. Hobson. The range of traded option prices. Mathematical Finance, 
17(1):1-14, 2007. 

[13] K. Demeterfi, E. Derman, M. Kamal, and J. Zou. A guide to volatility and variance swaps. 
The Journal of Derivatives, 6(4):9-32, 1999. 

[14] K. Demeterfi, E. Derman, M. Kamal, and J. Zou. More than you ever wanted to know
about volatility swaps. Goldman-Sachs Quantitative Strategies Research Notes, 1999.

[15] L.D. Dubins and D. Gilat. On the distribution of maxima of martingales. Proceedings of 
the American Mathematical Society, 68:337-338, 1978. 

[16] B. Dupire. Arbitrage pricing with stochastic volatility. Société Générale, Options Division,
Paris, 1992.

[17] H. Föllmer. Calcul d'Itô sans probabilités. In Séminaire de Probabilités, XV (Univ.
Strasbourg, Strasbourg, 1981), volume 15 of Lecture Notes in Math., pages 143-150. Springer,
Berlin, 1981.

[18] D.G. Hobson. Robust hedging of the lookback option. Finance and Stochastics, 2:329-347,
1998.

[19] D.G. Hobson. The Skorokhod embedding problem and model independent bounds for 
option prices. In Paris-Princeton Lecture Notes on Mathematical Finance. Springer, 2010. 

[20] D.G. Hobson and M. Klimmek. Maximising functionals of the joint law of the maximum
and terminal value in the Skorokhod embedding problem. Preprint, 2010.

[21] D.G. Hobson and J. L. Pedersen. The minimum maximum of a continuous martingale with 
given initial and terminal laws. Ann. Probab., 30(2):978-999, 2002. 

[22] R. Jarrow, Y. Kchia, M. Larsson, and P. Protter. Discretely sampled variance and volatility 
swaps versus their continuous approximations. Preprint, 2011. 

[23] N. Kahale. Model-independent lower bound on variance swaps. Preprint, 2011. 



[24] I. Martin. Simple variance swaps. Preprint, 2011. 

[25] A. Neuberger. The Log Contract. Journal of Portfolio Management, 20(2):74-80, 1994. 
[26] A. Neuberger. Realized skewness. Working paper, 2010. 

[27] E. Perkins. The Cereteli-Davis solution to the $H^1$-embedding problem and an optimal
embedding in Brownian motion. In Seminar on Stochastic Processes, 1985 (Gainesville,
Fla., 1985), pages 172-223. Birkhäuser Boston, Boston, MA, 1986.

[28] E. Platen and L. Chang. A cautious note on the design of volatility derivatives. ArXiv,
http://arxiv.org/abs/1007.2968v1, July 2010.

[29] A. V. Skorokhod. Studies in the theory of random processes. Translated from the Russian
by Scripta Technica, Inc. Addison-Wesley Publishing Co., Inc., Reading, Mass., 1965.

[30] S. Zhu and G-H. Lian. A closed-form exact solution for pricing variance swaps with stochas- 
tic volatility. Math. Finance, 11:233-256, 2011. 


